opaxmlextract is a flexible and powerful tool to process an XML stream. The tool:
• Requires no specific element names to be present in the XML.
• Assumes no hierarchical relationship between elements.
• Allows extracted element values to be output in any order.
• Allows an element's value to be extracted only in the context of another specified element.
• Allows extraction to be suppressed during the scope of specified elements.
opaxmlextract takes the XML input stream from either stdin or a specified input file. opaxmlextract does
not use or require a connection to a fabric.
opaxmlextract works from two lists of elements supplied on the command line or in a parameter file. The first is
a list of elements whose values are to be extracted, called extraction elements. The second is a list of
elements for which extraction is to be suppressed, called suppression elements. When an extraction
element is encountered and extraction is not suppressed, the value of the element is extracted for later
output in an extraction record. An extraction record contains a value for all extraction elements,
including those which have a null value.
When a suppression element is encountered, no extraction is performed during the extent of that element,
from start through end. Suppression is maintained for all elements nested inside the suppression element,
including elements that match extraction elements. Suppression can be used to prevent extraction in
sections of XML that are present but not of current interest, for example NodeDesc or NodeGUID inside a
Neighbor specification in opareport output.
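The suppression behavior described above can be illustrated with a minimal Python sketch. It is a simplified
model, not the tool's implementation: it matches on bare element names only, collects values instead of
building extraction records, and the function name extract_values and the sample XML are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def extract_values(xml_text, extract, suppress):
    """Collect (name, value) pairs for extraction elements outside suppressed scopes."""
    found = []
    suppressed_depth = 0  # greater than 0 while inside at least one suppression element
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if elem.tag in suppress:
                suppressed_depth += 1  # suppression begins at the element's start
        else:  # "end" event: the element's text is complete here
            if elem.tag in suppress:
                suppressed_depth -= 1  # suppression ends with the element
            elif suppressed_depth == 0 and elem.tag in extract:
                found.append((elem.tag, (elem.text or "").strip()))
    return found


# Hypothetical XML, loosely shaped like a node with a Neighbor section.
SAMPLE = """<Node>
  <NodeGUID>0x0001</NodeGUID>
  <NodeDesc>host1</NodeDesc>
  <Neighbor>
    <NodeGUID>0x0002</NodeGUID>
    <NodeDesc>switch1</NodeDesc>
  </Neighbor>
</Node>"""

print(extract_values(SAMPLE, {"NodeGUID", "NodeDesc"}, {"Neighbor"}))
# prints [('NodeGUID', '0x0001'), ('NodeDesc', 'host1')]
```

With an empty suppression set, the same call would also return the two values nested inside Neighbor.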
opaxmlextract attempts to generate extraction records with data values that are valid at the same time.
Specifying extraction elements that are valid in the same scope produces a single record for each group
of extraction elements. However, mixing extraction elements from different scopes (including different
XML levels) may cause opaxmlextract to produce multiple records.
opaxmlextract outputs an extraction record under the following conditions:
• One or more extraction elements containing a non-null value go out of scope (that is, the element
containing the extraction elements is ended) and a record containing the element values has not
already been output.
• A new and different value is specified for an extraction element and an extraction record
containing the previous value has not already been output.
Element names (extraction or suppression) can be made context-sensitive with an enclosing element name
using the syntax element1.element2. In this case, element2 is extracted (or extraction is suppressed)
only when element2 is enclosed by element1.
The syntax also allows '*' to be specified as a wildcard. In this case, *.element3 specifies element3
enclosed by any element or sequence of elements (for example, element1.element3 or
element1.element2.element3). Similarly, element1.*.element3 specifies element3 enclosed by element1 with
any number of (but at least one) intermediate elements.
opaxmlextract prepends '*.' to any entered element name that does not contain a '*' anywhere, so that
the element matches regardless of its enclosing elements.
NOTE: Any element names that include a wildcard should be quoted to prevent the shell from attempting to
wildcard match them against filenames.
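The context and wildcard rules can be sketched in Python as follows. This is an illustrative model under
stated assumptions, not the tool's matching code: the helper names normalize, pattern_to_regex, and matches
are hypothetical, a pattern is assumed to be compared against the full element path from the document root,
and every '*' is assumed to stand for one or more enclosing elements.

```python
import re


def normalize(name):
    """Prepend '*.' to a name that contains no '*', as the text describes."""
    return name if "*" in name else "*." + name


def pattern_to_regex(pattern):
    """Translate a dotted element pattern into a regex over a dot-joined path."""
    parts = []
    for piece in normalize(pattern).split("."):
        if piece == "*":
            # Assumption: '*' stands for one or more element names.
            parts.append(r"[^.]+(?:\.[^.]+)*")
        else:
            parts.append(re.escape(piece))
    return re.compile(r"\.".join(parts))


def matches(pattern, path):
    """Return True if the element path (root first) satisfies the pattern."""
    return pattern_to_regex(pattern).fullmatch(".".join(path)) is not None


print(normalize("element3"))                                           # *.element3
print(matches("*.element3", ["element1", "element3"]))                 # True
print(matches("element1.*.element3", ["element1", "element2", "element3"]))  # True
print(matches("element1.*.element3", ["element1", "element3"]))        # False
```

The last call returns False because element1.*.element3 requires at least one intermediate element.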
By default, opaxmlextract begins by outputting a delimited header record containing the names of the
extraction elements. The names appear in the order specified on the command line, which is also the order
of the values in each extraction record. Output of the header record can be disabled with the -H option.
By default, element names are shown as they were entered on the command line. The -v option causes
element names to be output as they are used during extraction, including any prepended wildcard
characters.
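As a rough illustration of the two header forms, the following Python sketch builds a header record from a
list of entered names. The function header_record and its as_used flag are hypothetical, and the ';'
delimiter is an assumption because this section does not state the delimiter character.

```python
def header_record(names, delimiter=";", as_used=False):
    """Join extraction element names into a header line.

    as_used=False models the default (names as entered);
    as_used=True models the -v form (names with any prepended '*.').
    """
    shown = []
    for name in names:
        if as_used and "*" not in name:
            name = "*." + name  # same prepending rule the tool applies to entered names
        shown.append(name)
    return delimiter.join(shown)


entered = ["NodeGUID", "Neighbor.*.NodeDesc"]
print(header_record(entered))                # NodeGUID;Neighbor.*.NodeDesc
print(header_record(entered, as_used=True))  # *.NodeGUID;Neighbor.*.NodeDesc
```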
Options (parameters) to opaxmlextract can be specified on the command line, with a parameter file, or
using both methods. A parameter file is specified with -P param_file. When a parameter file specification
is encountered on the command line, option processing on the command line is suspended, the parameter
file is read and processed entirely, and then command line processing is resumed.
Option syntax within a parameter file is the same as on the command line. Multiple parameter file
specifications can be made, on the command line or within other parameter files. At each point that a
parameter file is specified, current option processing is suspended while the parameter file is
processed, then resumed. Options are processed in the order they are encountered on the command line or
in parameter files. A parameter file can be up to 8192 bytes in size and may contain up to 512
parameters.
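The suspend-and-resume processing order can be modeled with a short recursive sketch in Python. It is not
the tool's parser: expand_args and the read_file callback are hypothetical, parameter files are assumed to
be tokenized like a shell command line (via shlex.split), and the -e and -s options in the example are
assumed to be the extraction and suppression options, which this section does not name.

```python
import shlex

MAX_FILE_BYTES = 8192   # size limit stated in the text
MAX_FILE_PARAMS = 512   # parameter-count limit stated in the text


def expand_args(args, read_file):
    """Return args with each '-P file' replaced, in place, by that file's parameters."""
    expanded = []
    i = 0
    while i < len(args):
        if args[i] == "-P":
            if i + 1 >= len(args):
                raise ValueError("-P requires a parameter file name")
            text = read_file(args[i + 1])
            if len(text.encode("utf-8")) > MAX_FILE_BYTES:
                raise ValueError("parameter file exceeds 8192 bytes")
            params = shlex.split(text)  # assumption: shell-like tokenization
            if len(params) > MAX_FILE_PARAMS:
                raise ValueError("parameter file exceeds 512 parameters")
            # Current processing is suspended while the file is expanded in full,
            # including any nested -P it contains, then resumes after it.
            expanded.extend(expand_args(params, read_file))
            i += 2
        else:
            expanded.append(args[i])
            i += 1
    return expanded


# In-memory stand-ins for parameter files (hypothetical names and contents).
FILES = {
    "outer.params": "-e NodeGUID -P inner.params -e NodeDesc",
    "inner.params": "-s Neighbor",
}

print(expand_args(["-H", "-P", "outer.params", "-v"], FILES.__getitem__))
# prints ['-H', '-e', 'NodeGUID', '-s', 'Neighbor', '-e', 'NodeDesc', '-v']
```

The sketch does not guard against a parameter file that includes itself; a real implementation would need
to detect or limit such nesting.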
Copyright(C) 2015-2019 Intel Corporation opaxmlextract(8)