What Is XTRAN?
XTRAN is a software engineering meta-tool we have developed that marries compiler and expert system technologies to provide the rule-driven automation of software engineering tasks involving a wide variety of computer languages (see list below). Through manipulation of XTRAN's Internal Representation (XIR) of these languages, specified via its powerful rules language, XTRAN allows you to automate:
- Analysis — both ad hoc and production
- Re-engineering — applying a set of transformations to the code
- Translation to a different language — either the same level, lower level (compilation), or higher level (decompilation)
- Standardization — imposing coding standards and conventions
- Text processing — an example of how XTRAN's expert system power can be applied to essentially any problem domain, by adding a set of primitives for that domain to XTRAN's rules language
By meta-tool, we mean a system for creating software engineering automation tools. XTRAN's powerful rules language, which we call meta-code, provides an extremely productive environment in which to develop such tools. In this tool development environment, minor software engineering tasks can be automated in minutes or hours, significant tasks in days, and major tasks in weeks.
XTRAN's rules language is not a "black art"; it can be learned and used by any competent senior software engineer, and multiplies the engineer's skills and talents, to accomplish much more for the effort expended.
We have already used XTRAN to create a number of such tools for analyzing, re-engineering, translating, and generating computer languages. Your senior software engineers can, after training, quickly create additional software engineering automation tools using XTRAN's rules language.
Computer languages XTRAN can manipulate include:
- Many assemblers (2GLs)
- Third generation languages (3GLs) such as C, C++, COBOL, Fortran, Java, Pascal, and PL/I
- Fourth generation languages (4GLs) such as Natural, RPG, and SAS
- Proprietary languages such as IBM's EDL and Norsk Data's NPL
- Markup languages such as HTML
- Meta-data languages such as XML
- Scripting languages
- Web languages such as Microsoft's C#
- Data base languages such as SQL
- Domain Specific languages
- XTRAN's own rules language (meta-code) — yes, XTRAN rules can themselves analyze, modify, and even create rules!
- Any other parsable computer language
What Can You Do with XTRAN?
XTRAN is capable of automating any software engineering task that can be described in its powerful rules language. We divide such tasks into several broad categories: analysis, re-engineering, translation and code generation, and text processing.
For analysis, we configure XTRAN with one or more input language parsers but no output language renderers. XTRAN's rules language provides sophisticated analysis of any language supported by an XTRAN parser. This powerful analysis capability can also produce documentation and program descriptive information suitable for input into CASE and modeling systems. In fact, XTRAN's analysis capability, specified through its rules language, allows you to extract any information that is present in the code, at any level of detail or abstraction, and in any form that's required.
XTRAN is usually run on a single module at a time. However, its ability to persist information across runs allows the collection and reporting of system-wide information. This approach is frequently used to perform global analysis of a software system using XTRAN. We often write XTRAN analysis rules that append information from each module to a file, then digest and report the accumulated information.
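The append-then-digest approach described above can be sketched in Python. This is only an illustration of the workflow, not XTRAN meta-code: the function names, the JSON-lines file format, and the "calls" field are all assumptions made for this example.

```python
import json
from collections import Counter
from pathlib import Path


def append_module_facts(store, module, calls):
    """One 'run' per module: append that module's facts as a single JSON line.

    The record layout (module name plus a list of called functions) is a
    hypothetical choice for this sketch.
    """
    with Path(store).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"module": module, "calls": calls}) + "\n")


def digest(store):
    """After all modules are processed, read the accumulated records and
    count calls system-wide."""
    totals = Counter()
    with Path(store).open(encoding="utf-8") as fh:
        for line in fh:
            totals.update(json.loads(line)["calls"])
    return totals
```

Each per-module run only appends, so modules can be analyzed in any order; the system-wide view comes entirely from the final digest step.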
Analysis with XTRAN can be either production analysis, as part of the normal software engineering cycle, or ad hoc analysis needed to address specific issues.
XTRAN analysis is especially useful in assessing a body of code in terms of:
- Code quality (cyclomatic complexity, structuring, etc.)
- Calling tree
- "Include" dependencies
- Symbol usage (global, local, or both)
- Cloned code
- Dead code
- Conformance to coding standards
- Code defects that can be identified using pattern matching and/or rules
- Code visualization, e.g. as HTML with color coding of statement types
- Forensic code analysis
- Anything you can think of!
Such an assessment is an important prerequisite to any migration or re-engineering project. In addition, ongoing assessment of code provides to programming management the information it needs to keep software development on the right track.
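Two of the assessments listed above, the calling tree and dead code, can be illustrated with a short Python sketch that analyzes Python source through the standard `ast` module. It is a stand-in for the idea of rule-driven analysis over a parsed representation, not XTRAN's implementation; the function names are hypothetical, and the sketch handles only direct calls by simple name from top-level functions.

```python
import ast
from collections import defaultdict


def call_graph(source):
    """Map each top-level function to the sorted names it calls directly."""
    graph = defaultdict(set)
    for func in ast.parse(source).body:
        if isinstance(func, ast.FunctionDef):
            graph[func.name]  # make sure functions with no calls still appear
            for node in ast.walk(func):
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    graph[func.name].add(node.func.id)
    return {name: sorted(callees) for name, callees in graph.items()}


def dead_functions(graph, roots):
    """Functions that cannot be reached from the given entry points."""
    seen, stack = set(), list(roots)
    while stack:
        name = stack.pop()
        if name in seen or name not in graph:
            continue  # already visited, or an external/built-in callee
        seen.add(name)
        stack.extend(graph[name])
    return set(graph) - seen
```

A function reported as dead here is merely unreachable by direct, named calls; calls made through function references or reflection would need additional analysis.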
For re-engineering and code standardization, we configure XTRAN with a parser and a renderer for the same language. You can then use XTRAN's powerful rules language to apply, across a body of code, any set of systematic changes that can be specified using rules. A few examples include:
- Enforce programming standards.
- Structure the code by automatically eliminating "gotos" in many cases, substituting functionally equivalent "if", "else", "for", "do", and "while" constructions.
- Find operating system dependencies and change them to a different operating system or standard library such as Posix.
- Change code to use a different API, such as a graphics library or DBMS.
- Find common low-level patterns of language usage and decompile them into higher-level constructions, thereby raising the level of the code.
- Requalify structure members to reflect changes to a structure's definition.
- Make large numbers of changes in symbol names.
Since XTRAN makes the changes to the code's XIR, which it then renders on output, the changed code is automatically restyled in terms of indentation, curly braces, line breaking, comment tabbing, etc. Since styling parameters are under user control, you can use XTRAN to restyle code as desired.
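The idea that a change is made to an internal representation and the code is then re-rendered (and therefore restyled) can be sketched with Python's standard `ast` module, which plays the role of parser, internal representation, and renderer for Python source. This is an analogy only; the class and function names are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast


class RenameSymbol(ast.NodeTransformer):
    """Rename every use of one symbol (variable references and parameters)."""

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):
        if node.arg == self.old:
            node.arg = self.new
        return node


def rename_and_render(source, old, new):
    """Parse, transform the tree, then render the tree back to text."""
    tree = RenameSymbol(old, new).visit(ast.parse(source))
    return ast.unparse(tree)
```

Given oddly formatted input such as `"def f( cnt ):\n        return cnt+1\n"`, renaming `cnt` to `total` yields output with normalized spacing and indentation, because the renderer works from the tree rather than from the original text.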
Translation, Code Generation
For translation or code generation, we configure XTRAN with one or more input language parsers and one or more output language renderers. (You can also use XTRAN's powerful re-engineering capabilities in conjunction with translation; see below for examples.)
Translation combinations we have implemented with XTRAN include:
- Encore (SEL, Gould) 32 assembler to C
- Fortran to C and C++
- HP (Digital, Compaq) PDP-11 MACRO-11 assembler to C
- HP (Digital, Compaq) VAX MACRO assembler to C
- IBM PL.8 / PL.9 / PL/ix to C++
- IBM Series/1 EDL to C
- IBM Series/1 assembler to C
- Intel PL/M to C
- Intel x86 assembler to C
- Motorola 680x assembler to Texas Instruments TMS370 assembler
- NEC 78C10 assembler to C
- Norsk Data NPL to C
- Pascal to C and C++
- PL/I to C
We have in development, or are planning, additional translation combinations of assemblers, 3GLs (including C, C++, COBOL, BASIC, Fortran, RPG, Pascal, PL/I, PL/M, Ada, and Java), 4GLs, markup languages (including HTML), and meta-data languages (including XML). Please contact us for more information.
With each XTRAN translation license, we deliver a standard set of translation rules for the appropriate language combination. After appropriate training, you can enhance or override those rules as needed to address issues specific to the code being translated.
Note: We have developed the IBM Series/1 EDL and Series/1 assembler to C versions of XTRAN in cooperation with Migration Solutions Incorporated (MSI) of Scottsdale, Arizona. MSI have developed EFL (EDL Function Library), a runtime library that supports the C code produced by XTRAN translation from EDL. MSI are also experts in the use of XTRAN to translate EDL and Series/1 assembler. In addition to supporting translation of EDL and assembler applications, MSI also offer an EDX emulator that can provide a quick and relatively painless way to move such applications off the Series/1 hardware.
For text processing, we configure XTRAN with only a meta-code parser, and no language renderers. You can then use the powerful text manipulation capabilities of XTRAN's rules language, which also allows you to read and write text files as desired.
XTRAN's regular expression, delimited list manipulation, and content-addressable data base capabilities, along with the other capabilities of its rules language, make it an extremely powerful text processor.
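To make these three capabilities concrete, here is a small Python sketch of regular-expression group replacement, delimited list manipulation, and a content-addressed store. It uses Python's standard library rather than XTRAN meta-code, and every function name and record format in it is an assumption made for illustration.

```python
import re
from collections import defaultdict

DATE = re.compile(r"(\d{2})/(\d{2})/(\d{4})")


def iso_dates(text):
    """Rewrite MM/DD/YYYY as YYYY-MM-DD by reusing the three captured groups."""
    return DATE.sub(r"\3-\1-\2", text)


def move_field(line, src, dst, sep=","):
    """Move one element of a delimited list from position src to position dst."""
    fields = line.split(sep)
    fields.insert(dst, fields.pop(src))
    return sep.join(fields)


def index_records(lines, key_columns, sep=","):
    """Store records under a key built from arbitrary text subscripts,
    one subscript per requested column (a simple content-addressed store)."""
    db = defaultdict(list)
    for line in lines:
        fields = line.split(sep)
        db[tuple(fields[c] for c in key_columns)].append(line)
    return dict(db)
```

Keying the store by a tuple of text values gives as many "dimensions" as there are key columns, which is the property that makes such a store convenient for organizing intermediate results.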
Combining XTRAN's Capabilities
Automating a complex software engineering task often requires a combination of XTRAN's analysis, re-engineering, translation, code generation, and text processing capabilities.
Code Quality Monitoring & Remediation
A critical part of running a successful software development operation is to maintain a high level of code quality, and adherence to the shop's coding standards and conventions. Of course, the definition of code quality varies from shop to shop, as do coding standards and conventions.
So an important property of any mechanism used to monitor code quality and remediate quality issues is flexibility — the ability to tailor the code quality analysis and remediation to the shop's definition of that quality, and to the shop's coding standards and conventions.
Monitoring code demographics and quality
A license for any analysis version of XTRAN comes with a wide variety of rules for measuring code "demographics" and quality:
- Statements per module
- Statement type frequencies, by function or module
- Module/function cross-reference, both directions, with optional frequencies
- Function calling tree, both directions, with optional frequencies
- Include/COPY dependency tree, both directions, with optional frequencies
- Symbol cross-reference, both directions, parameterized for many different reports
- Comment density
- McCabe's Cyclomatic Complexity
- Halstead's Volume
- Knots
- "Exit" statement count
- Maximum code nesting depth, with optional frequency distribution
- Dead code
- Cloned (copy/paste) code
You can use all of these rules that come with XTRAN "as is", or you can adapt them to your shop's definition of code quality and to your coding standards and conventions. And, after training, you can create XTRAN rules to add your own code demographics and quality analysis automation, working exactly the way you want it to.
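As an illustration of two of the measures listed above, McCabe's Cyclomatic Complexity and maximum nesting depth, here is a Python sketch that computes both for Python functions using the standard `ast` module. It is a simplified approximation written for this example, not XTRAN's rule set: the choice of which constructs count as decisions is an assumption, and an `elif` is counted as one level deeper than its `if`.

```python
import ast

DECISIONS = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)
NESTING = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def cyclomatic_complexity(func):
    """1 + number of decision points; each extra and/or operand adds one."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, DECISIONS):
            score += 1
        elif isinstance(node, ast.BoolOp):
            score += len(node.values) - 1
    return score


def max_nesting(node, depth=0):
    """Deepest level of nested control statements below this node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


def report(source):
    """Return {function name: (complexity, max nesting)} for top-level functions."""
    return {
        f.name: (cyclomatic_complexity(f), max_nesting(f))
        for f in ast.parse(source).body
        if isinstance(f, ast.FunctionDef)
    }
```

Because the definitions of `DECISIONS` and `NESTING` are plain data, adapting the measures to a shop's own definition of quality amounts to editing those two tuples.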
We recommend that you inject such code quality and standards adherence analysis into the Software Development Life Cycle as early as possible: at the point where the developer has a clean compile of new or changed code and is ready to submit it to a build for testing. If the code gets a passing grade, on it goes; if not, back it goes to the developer while it is still fresh in the developer's mind.
We know that the earlier a code defect is caught, the less it costs to fix and the less damage it does; this approach detects as many defects as possible, as early as possible.
Automating code quality remediation
A license for any re-engineering or translation version of XTRAN comes with a wide variety of rules for automating the remediation of code quality issues:
- Structure code: eliminate gotos by imposing
- Eliminate additional gotos by "unrolling" them, using local procedures to avoid code duplication
- Combine low-level expressions to a higher, more readable and maintainable level
- Eliminate unneeded code block constructs
- "Flatten" deeply-nested code to a more readable and maintainable form by extracting deep code levels as local procedures
- Convert lengthy
- Eliminate numbered gotos and "
- Delete dead code
You can use all of these rules that come with XTRAN "as is", or you can adapt them to your shop's definition of code quality and to your coding standards and conventions. And, after training, you can create XTRAN rules to add your own code quality remediation automation, working exactly the way you want it to.
Legacy Modernization & Migration
Many existing legacy software systems represent major investments that must be modernized and/or migrated in order to provide the agility required to remain competitive:
- Unlock the code from a platform and/or language that is proprietary or approaching end of life.
- Allow the use of modern software development tools, to increase the productivity of the software development department.
- Attract and keep the best architects and developers.
- Improve the quality of the code, to reduce maintenance costs and allow timely enhancements of the system.
- Move the code to an Object-Oriented language, explicating latent OO in the process.
- Rationalize disparate systems, to make them work together better and to provide a common software engineering language and platform.
- Prepare systems for transition to a Service Oriented Architecture (SOA), and automate that transition.
The best modernization and/or migration strategy will likely involve one or more of the following alternatives:
- Re-engineer (and possibly translate) the existing code, to improve its quality, repurpose it, rearchitect it for modern use, and/or re-host it onto a newer platform.
- Replace the code with commercial off-the-shelf software ("COTS").
- Totally reimplement the application from the ground up.
When it comes time to modernize and/or migrate your legacy applications, XTRAN can play a vital role in automating virtually every aspect of the process. Achieving a high level of software engineering automation is critical to the success of the project, to reduce the number of bugs introduced, and ultimately, to reduce the risk of failure.
- Automated Analysis
- Verify the accuracy of an existing functional specification against the code, or, in the worst case, extract such a specification from the code itself. Such a specification is required in order to determine the best modernization and/or migration strategy, and then implement that strategy.
- Determine the quality of the legacy code, to decide if it is worth saving, and assess the need for quality improvement prior to modernization or migration.
- Find and extract business rules from the code. If you decide to replace the application with COTS or to totally reimplement it, you must know what business rules to implement in the replacement. If you decide to re-engineer, you will need to know what the business rules are and where they are, so they can be exposed as services to be reused.
- Assess the impact of a legacy modernization project on the code body to be modernized. For example, if you anticipate a change to a DBMS or other third-party product, all calls to that API may have to be changed. Analysis with XTRAN can find and catalog information about all such calls, and assist in determining the best strategy for automating the changes.
- Assess the impact of a port on the code body to be ported. For example, all operating system dependencies may have to be changed. Analysis with XTRAN can find and catalog information about all such dependencies, and assist in determining the best strategy for automating the changes.
- Assess the quality and adherence to coding standards of the re-engineered or ported code, on an ongoing basis.
- Answer specific questions about what's in the code, on an ad hoc basis.
- Automated Re-Engineering
- Improve the quality and maintainability of the code, and raise its level, prior to modernization. This can significantly reduce the negative impact of a modernization effort, as well as improving the quality of the result.
- Expose the business rules in the code as services to be reused. This can include extracting such rules as components.
- Implement an API change, for example a change to a different DBMS or other third-party software product. This can involve changing every call to the product's API; XTRAN rules for automating such changes can actually take advantage of the API usage information catalogued during the analysis phase.
- Implement specific changes to the code, on an ad hoc basis.
- Move disparate systems to a common platform, for easier maintenance and sharing of skills and code.
- Automated Translation
- Move legacy code from a proprietary language to an open language, in order to reduce dependence on a specific vendor.
- Move legacy code from an obsolete language, for which it is increasingly difficult to find experienced programmers and modern development tools, to a modern language for which programmers and modern tools are available.
- Move legacy code from a non-portable language to a portable language, in order to move the application from an older platform, with a high price/performance ratio and rising maintenance costs and difficulty, to a modern, cost-effective platform with a lower price/performance ratio and lower maintenance costs.
- Move disparate systems to a common language, for easier maintenance and sharing of skills, tools, and code.
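The API-change workflow described above (catalogue every call to an API during analysis, then use that knowledge to automate the change) can be sketched in Python with the standard `ast` module. The module names `olddb` and `newdb`, the rename table, and all function names here are hypothetical; the sketch recognizes only calls written as `module.function(...)`, and `ast.unparse` requires Python 3.9 or later.

```python
import ast


def _is_api_call(node, module):
    """True for calls of the form module.function(...)."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and isinstance(node.func.value, ast.Name)
        and node.func.value.id == module
    )


def catalog_api_calls(source, module):
    """Analysis step: list (line, function, positional-argument count) per call."""
    return sorted(
        (n.lineno, n.func.attr, len(n.args))
        for n in ast.walk(ast.parse(source))
        if _is_api_call(n, module)
    )


class _Migrate(ast.NodeTransformer):
    def __init__(self, old, new, renames):
        self.old, self.new, self.renames = old, new, renames

    def visit_Call(self, node):
        self.generic_visit(node)  # handle calls nested in the arguments first
        if _is_api_call(node, self.old):
            node.func.value.id = self.new
            node.func.attr = self.renames.get(node.func.attr, node.func.attr)
        return node


def migrate_api(source, old, new, renames):
    """Re-engineering step: retarget every catalogued call and re-render."""
    return ast.unparse(_Migrate(old, new, renames).visit(ast.parse(source)))
```

The catalogue tells you which functions and argument counts actually occur in the code body, which in turn tells you which entries the rename table needs before the migration is run.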
Forensic Code Analysis
In civil and criminal legal proceedings involving computer code, it is sometimes necessary to analyze that code to determine its implications for the legal case. XTRAN's powerful analysis capabilities are ideal for this, and can be "tuned" to specific requirements using XTRAN's rules language.
One problem commonly encountered with such analysis is the sheer bulk of code that must be analyzed, often within a limited amount of time. XTRAN provides the automation to reduce this problem to manageable proportions, saving both time and money.
Forensic code analysis may involve:
- Analyzing the code's architecture. XTRAN can provide this, at any level of abstraction or detail.
- Determining the code's quality. XTRAN provides a number of popular quality measures, and others can be added as needed.
- Comparing two bodies of code to determine if one was copied from the other. XTRAN provides powerful code comparison capabilities, which can be "tuned" to any level of detail.
- Determining whether the code satisfies contractual requirements. XTRAN can be "tuned" to search the code for constructs that imply either satisfaction of, or failure to satisfy, such requirements.
- Determining whether contractually required documentation accurately describes the code. XTRAN can be "tuned" to search for constructs that verify, or fail to verify, the accuracy of the documentation.
- Determining whether the code contains any "back doors", "trap doors", or "bombs". XTRAN's pattern matching facilities can be used to look for common patterns that imply such problems in the code.
Stephen F. Heffner, XTRAN's author, is himself an expert witness, with report, deposition, and trial testimony experience. He has used XTRAN's forensic analysis capabilities in support of his expert witness activities.
Additional XTRAN Applications
XTRAN's capabilities are applicable to a wide variety of additional problems, including Euro currency conversion, code dialect translations, CASE tool interfaces, verification and enforcement of coding standards and styles, programming training and tutoring, and other rule-driven manipulation of computer languages and text data.
Ultimately, the only limit to the ways you can use XTRAN is your imagination.
How Can Your Organization Benefit from XTRAN?
The following XTRAN capabilities are available to all organizations with responsibility for a significant amount of code. Note that all of these capabilities are realized using XTRAN's rules language. XTRAN rule sets already exist for many of these examples, and are delivered with XTRAN. They can be enhanced and adapted by your senior systems programmers (after training), and they can create new rules as needed.
- Automate code analysis:
- Monitor code quality (by your definition) and demographics
- Extract and report code dependencies, including function calling tree, include/COPY tree, and data dependencies
- Locate and quantify dead code
- Locate and quantify cloned code
- Monitor adherence to coding standards and conventions
- Assess impact of changes to APIs
- Report offsets and sizes of structures and unions and their members
- Verify existing functional specifications against the code, or reverse engineer functional specifications from it
- Extract and report code's data and execution architecture
- Extract and format documentation carried in the code's comments
- Analyze and report state transitions at any level
- Detect and report potential bugs based on code patterns
- Display code with filters and/or colorization, to highlight code properties of interest
- Ad hoc analyses as needed
- Automate code re-engineering (transformations), examples:
- Improve code quality:
- Structure to remove GOTOs by imposing
- Decompile low-level constructs to higher-level ones
- Declone code
- Remove dead code
- Refactor deeply nested code to reduce nesting
- Add documentation based on analysis of the code
- Retrofit preprocessor definitions and the include/COPY files that declare them
- Impose coding standards and conventions on inherited or acquired code
- Impose API changes on all relevant function calls
- Impose data structure changes on all structure/union member references
- Consolidate inherited or acquired IT assets to common hardware and/or operating system
- Obfuscate/deobfuscate source code for security
- Prepare included files for multiple inclusions
- Ad hoc transformations as needed
- Automate code translation and generation:
- From lower-level to higher-level (decompilation), such as assembly code to 3GL, or 3GL to 4GL
- From obsolete language to modern one
- From proprietary language to portable one
- Consolidate inherited or acquired IT assets to common language
- Compile Domain Specific and special purpose languages to lower level languages
- Index and cross-link documents in markup languages such as HTML
- Move code from one dialect to another, or eliminate
- Automate sophisticated text processing, examples:
- Delimited list manipulation
- Regular expressions, including group matching and replacement
- Analyze and report state transitions
- Built-in data base for storing intermediate results
- XTRAN capabilities shown above, for your code
- Presale — automate code analysis needed to assess impact of moving customer to your hardware, operating system, languages, and/or APIs
- Automate re-engineering and/or translation needed to help customers move applications to your hardware, operating system, languages, and/or APIs
- Provide code analysis & re-engineering "black boxes" for your languages
Independent software vendor (ISV):
- XTRAN capabilities shown above, for your code
- Presale — automate code analysis needed to assess impact of moving customer to your operating system, languages, and/or APIs
- Automate re-engineering and/or translation needed to help customers move applications to your operating system, languages, and/or APIs
- Provide code analysis & re-engineering "black boxes" for your languages
Software services / outsource vendor:
- Presale — automate code analysis needed to quote assuming responsibility for your customer's code, including code quality assessment
- XTRAN capabilities shown above, for your customer's code
Enterprise architecture / IT consultant:
- XTRAN capabilities shown above, for your clients' portfolio management, system architecture, and software development
- Awareness of automation trends in software engineering
Expert witness / forensic analyst / law enforcement:
- Automate forensic code analysis
- XTRAN capabilities shown above, for your code
What Computer Languages Does XTRAN Handle?
XTRAN currently accommodates a wide variety of computer languages, including:
- Assemblers (2GLs)
- Encore (SEL, Gould) 32
- HP (Digital, Compaq) PDP-11 MACRO-11
- HP (Digital, Compaq) VAX MACRO
- IBM mainframe (360, 370, System z)
- IBM Series/1
- Intel x86
- Motorola 680x
- NEC 78C10
- Texas Instruments TMS370
- 3rd generation languages (3GLs)
- C, including K&R and ANSI dialects
- Fortran, including IBM, Mark III, and VMS dialects
- Pascal, including HP (Apollo) Domain, IBM, Microsoft, OmegaSoft, and VMS dialects
- PL/I, including IBM and VMS dialects
- 4th generation languages (4GLs)
- Adabas Natural
- Proprietary languages
- IBM Series/1 EDL
- IBM PL.8 / PL.9 / PL/ix
- Intel PL/M
- Microsoft C#
- Norsk Data NPL
- Markup languages
- Meta-data languages
- XTRAN's own rules language
XTRAN's modular and language-independent architecture, and its automated language parsing and rendering technologies, make it easy to add new languages. We have additional languages in development; if you don't see your language, please contact us.
XTRAN's Architecture
XTRAN consists of:
- A powerful and sophisticated rules language, which we call meta-language or meta-code because it is used to define and manipulate other languages.
- A language-independent inference engine for evaluating rules written in meta-code.
- Language-specific parsers that read computer language text files and convert them to:
- A proprietary XTRAN Internal Representation (XIR) that XTRAN uses to represent all computer languages (and meta-code) during its processing, and which meta-code manipulates as its data.
- Language-specific renderers that render and output the processed XIR as computer language text files.
Here's a graphical look at XTRAN's code and data architecture:
The left side represents XTRAN's code, and reflects the sequence of phases XTRAN executes when it runs. The right side represents the data that XTRAN keeps in memory as it runs. The arrows indicate the production and consumption of XIR by XTRAN's various execution phases.
The dark blue parts of the code (language parsers and renderers) are language-specific; the remainder of the code, and XTRAN's rules language, are language-independent.
This unique architecture means that a new language (or language combination) requires only the development of a parser and/or renderer, plus a set of rules, in order to apply the power of XTRAN to a new language manipulation problem.
All of the XIR in XTRAN is organized into code trees. XTRAN has many built-in code trees; you can also create and use your own.
Internally, XTRAN is highly object-oriented. "Computer language" is a data class, as are "parser" and "renderer". XTRAN can be configured with multiple parsers and/or renderers as needed, to handle a mixture of input and/or output languages. (XTRAN is always configured with at least a parser for its own rules language.)
XTRAN's Rules Language (meta-code)
XTRAN's powerful rules language (which we call meta-language or meta-code) is an evaluated, interpretive language. Its syntax is like C, but its semantics are more like Lisp. It includes meta-statements, meta-expressions, meta-variables, meta-functions, and meta-comments.
Interpretive evaluation of meta-code is so fast that it allows processing of very large amounts of code in reasonable time. For example, we recently performed substantial re-engineering and analysis, involving intensive XTRAN rules evaluation, on more than 600,000 code lines of PL/I in just over four hours on a laptop computer.
The many capabilities offered by XTRAN through its rules language include:
- Data types: Integer, real, text, file, expression, and statement. The latter two refer to values consisting of expressions or statements represented as XTRAN Internal Representation (XIR).
- Extensive facilities for controlling rule evaluation timing (early vs. late binding), in terms of XTRAN's phased operation, including the ability to pass unevaluated meta-code around and to selectively force, or protect from, evaluation at the expression and subexpression levels.
- More than 425 built-in meta-functions for manipulating XIR of computer languages and other data, and for affecting XTRAN's state. Each calling argument can be any meta-expression that evaluates to the appropriate data type.
- Built-in meta-functions include a full set of operator meta-functions, including equivalents to all of C's operators, and n-ary forms of some operators that are binary in "in-fixed operator" languages.
- The ability to extend XTRAN's rules language by creating user meta-functions, written in meta-code, with data-typed parameters. Such user meta-functions are invoked exactly the same as built-in meta-functions. User meta-functions can be recursive.
- Iterator statements ("for", "while", "do") and alternator statements ("if", "else"), to control the results of evaluation.
- Recursive iterators, as part of the rules language, that visit each XIR element (statement, expression, symbol, etc.) once and apply rules to it. This allows you to concentrate on the job at hand, instead of worrying about recursively traversing the code you're manipulating.
- Navigation that allows you, from a particular XIR meta-entity (statement, expression, etc.), to access and possibly change the context in which it occurs, all the way out to the full body of code currently being processed.
- Text manipulation facilities, including extensive facilities for manipulating delimited lists such as comma-separated values.
- The text formatting facilities of C's
- Regular expressions for manipulating text, including "egrep" grouping facilities. You can capture a copy of the text that matched each group, if desired.
- File I/O for creating, appending to, reading, and writing text files.
- Terminal I/O for communicating with the user at run-time via
- Built-in facilities for interactive graphical browsing and exploration of every detail of XIR, including full hypertext browsing capabilities.
- A built-in data base facility that provides n-dimensional, content-addressable data bases for storing XIR and data. Each data base's number of dimensions is unlimited, and each dimension can be subscripted by arbitrary text strings or numbers, or can have no subscript. This facility is extremely useful for organizing code fragments and information, both when analyzing code and when transforming it.
- Powerful pattern matching and replacement facilities, at the statement and expression levels (see below).
- The ability to compare the XIR of two sets of code, with extensive control over what XIR entities are compared and how. For example, you could choose to exclude comments from the comparison. By choosing which entities to exclude, the comparison can be as abstract or detailed as desired. As with XTRAN's pattern matching facilities, comparison of XIR is totally independent of physical aspects of the code such as line breaks, indentation, and comment tabbing.
- The ability to move computer languages (including meta-code) between the symbolic domain (XIR) and the text domain (source code), in both directions, dynamically; in other words, incremental, fully dynamic parsing and rendering. This allows computer languages to be manipulated in both their symbolic (XIR) form and their text form. It also allows rules to write more rules, then parse and evaluate them.
- The ability to embed meta-code in any host language, at both statement and expression levels, and to embed host languages in meta-code.
- Full access to XIR from rules, allowing you to explore the XIR form of code being processed, to any level of detail.
- Code rendering decorations, which are text strings XTRAN inserts in the output when rendering code:
- At start of output
- Before and/or after each statement of a given language, based on a condition
- Before and/or after each expression of a given language, based on a condition
- At end of output
Such decorations could, for example, be HTML tags to emphasize or color statements or expressions based on conditions you specify.
- Statement output filtering, by language: XTRAN renders only those statements that meet at least one of a series of conditions. This is a good way to restrict XTRAN's rendered output to just what you are interested in.
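As a rough illustration of the rendering decorations and statement output filtering listed above, here is a minimal Python sketch. The `Statement` and `Decoration` classes and the `render` function are hypothetical stand-ins invented for this example; they are not XTRAN's rules language or XIR, which are proprietary. The sketch covers only statement-level decorations; expression-level decorations would follow the same idea.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for one renderable statement (not XTRAN's XIR).
@dataclass
class Statement:
    kind: str   # e.g. "assign", "goto", "call"
    text: str   # the statement's rendered source text

# Hypothetical decoration: text placed before/after statements meeting a condition.
@dataclass
class Decoration:
    condition: Callable[[Statement], bool]
    before: str = ""
    after: str = ""

def render(statements, decorations, filters=None, start="", end=""):
    """Render statements one per line.

    If filters are given, a statement is rendered only when it meets at
    least one of them. Each rendered statement is wrapped in every
    decoration whose condition it satisfies.
    """
    lines = [start] if start else []
    for stmt in statements:
        if filters and not any(keep(stmt) for keep in filters):
            continue  # filtered out of the rendered output
        text = stmt.text
        for deco in decorations:
            if deco.condition(stmt):
                text = deco.before + text + deco.after
        lines.append(text)
    if end:
        lines.append(end)
    return "\n".join(lines)

program = [
    Statement("assign", "x = 1;"),
    Statement("goto", "goto done;"),
    Statement("call", "f(x);"),
]
bold_gotos = Decoration(lambda s: s.kind == "goto", before="<b>", after="</b>")

# Decorate: emphasize every goto with HTML tags, and bracket the whole output.
decorated = render(program, [bold_gotos], start="<pre>", end="</pre>")

# Filter: render only the statements of interest.
gotos_only = render(program, [], filters=[lambda s: s.kind == "goto"])
```

Keeping the conditions as ordinary functions is a design choice of this sketch; it mirrors the idea that a decoration or filter is driven by a condition you specify rather than by a fixed list of statement types.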
We have used XTRAN as a meta-tool to create many XTRAN rule sets that automate a wide variety of analysis, re-engineering, and translation tasks. We provide these rules, as appropriate, with each XTRAN license. And, of course, after appropriate training, you can create additional rules to automate both production and ad hoc software engineering tasks.
Note that XTRAN's rules language is proprietary to Pennington; access to it requires an XTRAN license or a Nondisclosure Agreement. Please contact us for more information.
XTRAN's Pattern Matching and Replacement Facilities
XTRAN provides, via its rules language, an extremely powerful suite of pattern matching and replacement facilities you can apply to XIR at both the statement and expression levels:
- You can specify a pattern in a host language, in meta-code, or in a combination of both.
- Any element of a pattern can be "wild".
- You can optionally qualify each such "wild" element using an arbitrarily complex condition comprising additional meta-code to be evaluated for each match attempt. Such a condition can, through navigation, explore the context in which the match is being attempted.
- XTRAN will, if requested, capture a copy of what matched a "wild" element, at match time.
- Such a copy can then be reported, reused later in the same pattern, or reused in a replacement pattern.
- Since pattern matching is done on XIR, it is totally independent of physical aspects of the code such as line breaks, indentation, and comment tabbing.
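The ideas in the list above (wild elements, qualifying conditions, captures, reuse in a replacement, and independence from layout) can be sketched in Python, using Python's own `ast` module as a stand-in for XIR. Everything here is an assumption made for illustration: the `ANY_` naming convention for wild elements, the `matches` function, and the `Rewriter` class are invented for this example and are not XTRAN's meta-code. The sketch needs Python 3.9 or later for `ast.unparse`.

```python
import ast
import copy

WILD_PREFIX = "ANY_"  # hypothetical convention: names with this prefix are wild

def matches(pattern, node, captures, conditions):
    """Return True if node matches pattern, recording what each wild element matched."""
    if isinstance(pattern, ast.Name) and pattern.id.startswith(WILD_PREFIX):
        name = pattern.id[len(WILD_PREFIX):]
        check = conditions.get(name)
        if check is not None and not check(node):
            return False                      # qualifying condition rejected it
        if name in captures:                  # wild element reused in the same pattern
            return ast.dump(captures[name]) == ast.dump(node)
        captures[name] = node                 # capture what matched
        return True
    if isinstance(pattern, ast.AST):
        return type(pattern) is type(node) and all(
            matches(getattr(pattern, f, None), getattr(node, f, None),
                    captures, conditions)
            for f in pattern._fields)
    if isinstance(pattern, list):
        return (isinstance(node, list) and len(pattern) == len(node)
                and all(matches(p, n, captures, conditions)
                        for p, n in zip(pattern, node)))
    return pattern == node                    # plain field values such as constants

class Substitute(ast.NodeTransformer):
    """Fill wild elements in a replacement pattern with captured subtrees."""
    def __init__(self, captures):
        self.captures = captures

    def visit_Name(self, node):
        if node.id.startswith(WILD_PREFIX):
            return copy.deepcopy(self.captures[node.id[len(WILD_PREFIX):]])
        return node

class Rewriter(ast.NodeTransformer):
    """Replace every expression matching pattern_src with replacement_src."""
    def __init__(self, pattern_src, replacement_src, conditions=None):
        # Patterns are written in the host language and parsed to trees.
        self.pattern = ast.parse(pattern_src, mode="eval").body
        self.replacement = ast.parse(replacement_src, mode="eval").body
        self.conditions = conditions or {}

    def visit(self, node):
        node = self.generic_visit(node)       # rewrite children first
        captures = {}
        if isinstance(node, ast.expr) and matches(
                self.pattern, node, captures, self.conditions):
            return Substitute(captures).visit(copy.deepcopy(self.replacement))
        return node

# The odd line break and the comment do not affect matching, because the
# match is done on the tree rather than on the text.
source = "y = (a *\n      b) + 0   # odd layout\nz = 3 + 0\n"
rule = Rewriter("ANY_x + 0", "ANY_x",
                conditions={"x": lambda n: not isinstance(n, ast.Constant)})
result = ast.unparse(rule.visit(ast.parse(source)))
print(result)
```

The condition on `x` keeps the rule from rewriting `3 + 0`, so only the first assignment is simplified. A condition in this sketch sees only the candidate subtree; exploring the surrounding context, as described above, would require passing more information to it.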
XTRAN's Language Parsers and Renderers; XBNF
An XTRAN language parser performs the task of reading a computer language's text source code and creating the XTRAN Internal Representation (XIR) that represents that code. XTRAN's parsing engine allows construction of a parser in XTRAN's rules language (meta-code) by describing the grammar to be parsed, using a modified form of Backus-Naur Form (BNF) that we call XTRAN BNF (XBNF). XBNF, as used for parsing, references a small number of hard-coded parsing primitives, and allows recursive productions. You can construct additional parsing primitives in meta-code using XBNF.
Similarly, an XTRAN language renderer performs the task of rendering a computer language's XIR as source code text and writing it out, handling all of the code styling issues that implies. XTRAN's language rendering engine allows construction of a renderer in meta-code by describing the grammar to be output, using a rendering version of XBNF. XBNF, as used for rendering, references a small number of hard-coded output primitives, and allows recursive productions. You can construct additional output primitives in meta-code using XBNF.
XBNF includes facilities for parsing and rendering nonpositional (free-format) languages, such as C and PL/I; positional (column-oriented) languages, such as RPG, job control languages, and some assemblers; and languages that are a mixture of positional and nonpositional, such as COBOL.
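To make the positional versus nonpositional distinction concrete, here is a small Python sketch. It is only an illustration and does not use XBNF. The column ranges are those of standard fixed-format COBOL source (columns 1-6 sequence number, column 7 indicator, columns 8-72 code, columns 73-80 identification); the function names are invented for this example.

```python
import re

def split_fixed_format(line):
    """Positional: a field's meaning comes from the columns it occupies."""
    padded = line.rstrip("\n").ljust(80)
    return {
        "sequence": padded[0:6],                    # columns 1-6
        "indicator": padded[6],                     # column 7 ('*' marks a comment)
        "code": padded[7:72].rstrip(),              # columns 8-72
        "identification": padded[72:80].rstrip(),   # columns 73-80
    }

def tokenize_free_format(text):
    """Nonpositional: only the token sequence matters, not the layout."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", text)

statement = split_fixed_format("000100     MOVE A TO B.")
comment = split_fixed_format("000200* PRINT THE TOTAL")

# Two layouts of the same free-format statement yield the same tokens.
compact = tokenize_free_format("x=a+b;")
spread = tokenize_free_format("x = a +\n      b ;")
```

A language that mixes the two styles, as the text notes for COBOL, needs both steps: the column split first, then free-format tokenizing of the code field.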
XTRAN's XBNF-driven parsing and rendering capabilities include fully integrated dialect control, allowing a parser or renderer to be conditioned on the language dialect being parsed. This dialect control is dynamic, allowing you to switch dialects during parsing or rendering if needed.
Unlike "compiler compilers", which generate parsers as their output, XTRAN effectively executes XBNF dynamically at language parsing or rendering time, including a "fastback" parsing feature. XTRAN provides an XBNF trace facility to help debug XBNF.
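The difference between generating a parser and executing a grammar at parse time can be sketched in Python. The grammar notation below is invented for this example and is not XBNF, and the `NUMBER` primitive merely stands in for a hard-coded parsing primitive. The sketch tries alternatives in order with simple backtracking, which is an assumption of the example and not a description of XTRAN's parsing engine.

```python
import re

# Hypothetical grammar-as-data: each production has a list of alternatives,
# and each alternative is a sequence of production names, primitives, or
# quoted literal tokens. Productions may be recursive.
GRAMMAR = {
    "expr":   [["term", "'+'", "expr"], ["term"]],
    "term":   [["factor", "'*'", "term"], ["factor"]],
    "factor": [["NUMBER"], ["'('", "expr", "')'"]],
}
PRIMITIVES = {"NUMBER": re.compile(r"\d+")}

def tokenize(text):
    return re.findall(r"\d+|[-+*/()]", text)

def parse(symbol, tokens, pos, grammar):
    """Try to parse symbol starting at tokens[pos].

    Returns (tree, next_pos) on success or None on failure. The grammar is
    consulted on every call, so changes to it take effect immediately.
    """
    if symbol.startswith("'"):                      # quoted literal token
        if pos < len(tokens) and tokens[pos] == symbol.strip("'"):
            return tokens[pos], pos + 1
        return None
    if symbol in PRIMITIVES:                        # built-in primitive
        if pos < len(tokens) and PRIMITIVES[symbol].fullmatch(tokens[pos]):
            return tokens[pos], pos + 1
        return None
    for alternative in grammar[symbol]:             # ordered choice
        children, cursor = [], pos
        for part in alternative:
            result = parse(part, tokens, cursor, grammar)
            if result is None:
                break                               # back up, try next alternative
            child, cursor = result
            children.append(child)
        else:
            return (symbol, children), cursor
    return None

tree, end = parse("expr", tokenize("2 * (3 + 4)"), 0, GRAMMAR)

# Subtraction is not in the grammar yet, so only "5" is consumed.
before = parse("expr", tokenize("5 - 2"), 0, GRAMMAR)

# Extend the grammar at run time; no parser has to be regenerated.
GRAMMAR["expr"].insert(0, ["term", "'-'", "expr"])
after = parse("expr", tokenize("5 - 2"), 0, GRAMMAR)
```

Because the grammar is plain data read during parsing, swapping in a different set of productions partway through a parse is straightforward, which is one way to picture switching dialects dynamically.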
XBNF is integrated with meta-code, so a language parser or renderer can be written to be "tuned" with additional rules, dynamically if appropriate.
For historical reasons, some of XTRAN's older language parsers and renderers are hard-coded. However, they can be enhanced and/or overridden using parsing or rendering XBNF.
@DBG: XTRAN's Meta-Debugger
XTRAN includes a powerful, full-featured, interactive meta-debugger, called @DBG, which allows you to control XTRAN's execution and debug your meta-code. @DBG's many features include:
- Breaks to @DBG on any of:
  - Reference to a specified source or target statement, optionally limited to one or more specified XTRAN processing phases (parse, process, etc.)
  - Evaluation of a specified meta-statement
  - Occurrence of a specified event
  - Start of a specified XTRAN processing phase
  - Reference to a specified symbol
  - Parse of a specified atom
  - Attempt to parse a specified XBNF text element
  - Attempt and/or success of a specified statement pattern match/replacement
  - Evaluation of XTRAN's built-in "break to @DBG" meta-function anywhere in the rules being evaluated, with optional conditionalization of the break and @DBG commands to be executed
- Break actions (@DBG commands to execute at time of break)
- Meta-variable value change watchpoints, with optional action such as a break to @DBG
- Meta-function call traceback
- Stepping of meta-code evaluation: Over, into, and out of user meta-functions; to a specified meta-statement
- Dynamic meta-code evaluation at @DBG command level
- Command history
- Invocation of XTRAN's graphical, interactive XIR browsing mode
- Indirect @DBG command files
Many of these features are also accessible via XTRAN's command line flags or built-in meta-functions.
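For readers unfamiliar with value-change watchpoints, the general idea can be sketched in a few lines of Python. This is a generic illustration only; the `WatchedVariables` class is invented for this example and says nothing about how @DBG is implemented or what its commands look like.

```python
class WatchedVariables(dict):
    """Variable store that runs an action when a watched variable's value changes."""

    def __init__(self):
        super().__init__()
        self._actions = {}          # variable name -> action(name, old, new)

    def watch(self, name, action):
        """Register an action to run whenever the named variable changes value."""
        self._actions[name] = action

    def __setitem__(self, name, value):
        old = self.get(name)
        super().__setitem__(name, value)
        # Fire only for watched variables, and only on an actual change.
        if name in self._actions and old != value:
            self._actions[name](name, old, value)

log = []
variables = WatchedVariables()
# The action here just records the change; a debugger might break instead.
variables.watch("count", lambda name, old, new: log.append((name, old, new)))

variables["count"] = 1
variables["count"] = 1      # same value: no action
variables["count"] = 2
variables["other"] = 5      # not watched: no action
```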
XTRAN Documentation
When we license and deliver XTRAN, we provide with it a variant of the XTRAN User's Manual appropriate to the licensed activity and computer languages. The XTRAN User's Manual comprises about 50,000 lines of HTML, organized into more than 60 chapters. It provides a thorough reference to XTRAN, with many usage examples.
Since HTML is one of the computer languages XTRAN can analyze, re-engineer, translate, and generate, we use XTRAN to cross-link and index its own XTRAN User's Manual, and to produce variants that match licensed XTRAN activities.
We also provide, with XTRAN, a large number of stand-alone examples appropriate to licensed XTRAN activities, including actual XTRAN rules, input, and output.
Note that the XTRAN User's Manual and XTRAN examples are proprietary to Pennington; access to them requires an XTRAN license or a Nondisclosure Agreement. Please contact us for more information.
Where Did XTRAN Come From?
In 1983, we needed to port one of our products, XFORM, from the Digital VAX computer to the PC. XFORM was originally written in Digital PDP-11 assembler, but we had previously translated it to VAX assembler, creating our CONPAX translation tool to automate the process. (CONPAX was so successful at this that we brought it to market, and it helped over forty licensees translate millions of lines of assembler code.) So the requirement was to translate the VAX assembler into C.
Our Founder and President, Stephen F. Heffner, hand-translated XFORM's VAX assembler to C. He observed that this process was very tedious and mechanical. However, at the same time, it required significant judgment and the application of sophisticated rules. Since we had already created one tool for automatic translation, he began to think about how this more demanding type of translation could be automated.
In 1984, one of our large multinational clients had a problem — a large body of Digital PDP-11 assembly code needed to be ported to a more modern computer. However, the code had been worked on by many programmers over a long period of time, and was not very well documented, either internally or externally.
They decided that, before they could port the code, they needed to fully understand what it did; in other words, they needed to create an accurate functional specification for it. The problem was that they had very few PDP-11 assembly programmers left by then, and the few they had were heavily committed. So they thought that, if they could somehow get the PDP-11 assembly code into C with equivalent functionality, they could put some of their C programmers to work figuring out and documenting the code.
Since they were using our XFORM product, and had previously used our CONPAX translation tool to port some of their other PDP-11 assembly code to VAX assembler, they thought of us, and came to us with a question: Did we think it would be possible to automate the translation of PDP-11 assembly code to C? Because of our experience with CONPAX, and our recent experience in translating assembler to C by hand, our answer was yes — we thought that was feasible.
Our client then issued an RFP for a feasibility study to us and four other firms, primarily compiler vendors. We and two other vendors responded with bids, and our client funded all three of us.
Our two competitors submitted papers discussing how they would approach the problem. Instead of a paper, we submitted a prototype of XTRAN, as a proof of concept, and successfully demonstrated the automation of assembler translation to C using a rules-based approach.
Our client then funded us to create a full production version of XTRAN. They got an unlimited license to it, but we kept full ownership of the product. With help from us, they then used XTRAN to translate their PDP-11 assembly code to C. Although their original intention was to use XTRAN only as a reverse engineering tool, the resulting C code was good enough that they actually used it for the port.
Since that time, XTRAN has grown tremendously in both language coverage and overall capabilities. It now comprises about half a million lines of high-quality, sophisticated, and highly portable C code. However, as a testament to the robustness of XTRAN's original design and XTRAN Internal Representation (XIR), they both survive to this day, essentially in their original form.