
XML::GRDDL - transform XML and XHTML to RDF

Author

       Toby Inkster <tobyink@cpan.org>.

Bugs

       Please report any bugs to <http://rt.cpan.org/>.

       Known limitations:

       •   Recursive GRDDL doesn't work yet.

           That is, the profile documents and namespace documents linked to from your  primary  document  cannot
           themselves rely on GRDDL.

Description

       GRDDL is a W3C Recommendation for extracting RDF data from arbitrary XML and XHTML via a transformation,
       typically written in XSLT. See <http://www.w3.org/TR/grddl/> for more details.

       This module implements GRDDL in Perl. It offers two interfaces: a low-level interface, which returns the
       list of transformations associated with the document being processed so that you can run them
       selectively; and a high-level interface, which returns a single RDF model representing the union of the
       RDF graphs generated by applying all available transformations.

   Constructor
       "XML::GRDDL->new"
           The constructor accepts no parameters and returns an XML::GRDDL object.

   Methods
       "$grddl->discover($xml, $base, %options)"
           Processes the document to discover the transformations associated with it. $xml is the raw XML source
           of  the  document,  or  an  XML::LibXML::Document object. ($xml cannot be "tag soup" HTML, though you
           should be able to use HTML::HTML5::Parser to parse tag soup into an XML::LibXML::Document.) $base  is
           the base URI for resolving relative references.

           Returns a list of XML::GRDDL::Transformation objects.

           Options include:

           •   force_rel  -  boolean;  interpret  XHTML  rel="transformation"  even  in the absence of the GRDDL
               profile.

           •   strings - boolean; return a list of plain strings instead of blessed objects.
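
           As a minimal sketch of the "strings" option, the snippet below runs "discover" on a small inline
           document. The document, its transformation name and the example.com base URI are hypothetical,
           and the assumption that the returned strings are the transformation URIs is not stated above.

```perl
use strict;
use warnings;
use XML::GRDDL;

# Hypothetical input: the root element names its own transformation through
# the grddl:transformation attribute. The root element is in no namespace,
# so there is no namespace document to consult.
my $xml = <<'XML';
<?xml version="1.0"?>
<root xmlns:grddl="http://www.w3.org/2003/g/data-view#"
      grddl:transformation="transform.xsl">
  <item>example</item>
</root>
XML

my $base = 'http://example.com/doc.xml';    # assumed base URI

my $grddl = XML::GRDDL->new;

# strings => 1 asks for plain strings rather than
# XML::GRDDL::Transformation objects.
my @found = $grddl->discover($xml, $base, strings => 1);

print "$_\n" for @found;
```

           Asking for strings is convenient when you only want to inspect or log what a document links to
           before deciding whether to run anything.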

       "$grddl->data($xml, $base, %options)"
           Processes the document, discovers the transformations associated with it, applies the transformations
           and merges the results into a single RDF model. $xml and $base are as per "discover".

           Returns an RDF::Trine::Model containing the data. Statement contexts (a.k.a. named  graphs  /  quads)
           are used to distinguish between data from the result of each transformation.

           Options include:

           •   force_rel  -  boolean;  interpret  XHTML  rel="transformation"  even  in the absence of the GRDDL
               profile.

           •   metadata - boolean; include provenance information in the default graph (a.k.a. nil context).
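
           The named-graph layout described above can be explored with the RDF::Trine::Model iterator
           methods. The following is a sketch only: "grddl_report" is a hypothetical helper name, and the
           exact shape of the provenance statements is not specified here, so they are simply printed.

```perl
use strict;
use warnings;
use XML::GRDDL;
use RDF::Trine;

# Hypothetical helper: apply all transformations, then report how many
# statements each context (named graph) holds.
sub grddl_report {
    my ($xml, $base) = @_;

    my $grddl = XML::GRDDL->new;
    my $model = $grddl->data($xml, $base, metadata => 1);

    # In scalar context get_contexts returns an iterator over context nodes.
    my $contexts = $model->get_contexts;
    while (my $ctx = $contexts->next) {
        my $count = 0;
        my $iter  = $model->get_statements(undef, undef, undef, $ctx);
        $count++ while $iter->next;
        printf "%s: %d statement(s)\n", $ctx->as_string, $count;
    }

    # With metadata => 1, provenance is placed in the nil context.
    my $nil  = RDF::Trine::Node::Nil->new;
    my $prov = $model->get_statements(undef, undef, undef, $nil);
    while (my $st = $prov->next) {
        print "provenance: ", $st->as_string, "\n";
    }

    return $model;
}
```

           A call such as grddl_report($xml, 'http://example.com/doc.xml') returns the model so that it
           can be queried or serialised afterwards.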

       "$grddl->ua( [$ua] )"
           Get/set the user agent used for HTTP requests. $ua, if supplied, must be an LWP::UserAgent.
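
           A possible way to supply a preconfigured user agent is sketched below. The application name and
           the timeout value are illustrative assumptions. Note that when an agent string ending in a space
           is set through LWP::UserAgent's constructor or its "agent" method, LWP appends its own
           "libwww-perl/#.###" token.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use XML::GRDDL;

# Illustrative settings: a 10 second timeout and an application-specific
# User-Agent string (the trailing space makes LWP append its own token).
my $ua = LWP::UserAgent->new(
    timeout => 10,
    agent   => 'MyApp/1.0 ',
);

my $grddl = XML::GRDDL->new;
$grddl->ua($ua);    # subsequent HTTP requests use this agent

print "User-Agent: ", $grddl->ua->agent, "\n";
```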

   Constants
       These constants may be exported upon request.

       "GRDDL_NS"
       "XHTML_NS"
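
       The sketch below assumes the constants are requested in the usual Exporter style and that they hold
       the GRDDL and XHTML namespace URIs, as their names suggest. "root_transformations" is a hypothetical
       helper that reads the grddl:transformation attribute by hand; it is not part of the module.

```perl
use strict;
use warnings;
use XML::GRDDL qw(GRDDL_NS XHTML_NS);    # assumed import style
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical helper: list the URI references named in the root element's
# grddl:transformation attribute, without resolving or fetching them.
sub root_transformations {
    my ($xml) = @_;

    my $doc = XML::LibXML->load_xml(string => $xml);
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs(grddl => GRDDL_NS);
    $xpc->registerNs(xhtml => XHTML_NS);

    # The attribute value is a whitespace-separated list of references.
    my $value = $xpc->findvalue('/*/@grddl:transformation');
    return grep { length } split /\s+/, $value;
}
```

       In normal use "discover" does this work, and more, for you; the constants are mainly useful when
       you need the namespace URIs in your own XPath or RDF code.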

Disclaimer Of Warranties

       THIS  PACKAGE  IS  PROVIDED  "AS  IS"  AND  WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT
       LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

perl v5.40.0                                       2024-11-22                                    XML::GRDDL(3pm)

Features

       XML::GRDDL supports transformations written in XSLT 1.0, and in RDF-EASE.

       XML::GRDDL is a good HTTP citizen: Referer headers are  included  in  requests,  and  appropriate  Accept
       headers  supplied.  To be an even better citizen, I recommend changing the User-Agent header to advertise
       the name of the application:

        $grddl->ua->default_header(user_agent => 'MyApp/1.0 ');

       Provenance  information  for  GRDDL  transformations  is  returned  using   the   GRDDL   vocabulary   at
       <http://www.w3.org/2003/g/data-view#>.

       Certain XHTML profiles and XML namespaces known not to contain any transformations, or to contain only
       useless transformations, are skipped. See XML::GRDDL::Namespace and XML::GRDDL::Profile for details. In
       particular, profiles for RDFa and many Microformats are skipped, as RDF::RDFa::Parser and
       HTML::Microformats will typically yield far superior results.

Name

       XML::GRDDL - transform XML and XHTML to RDF

See Also

       XML::GRDDL::Transformation,                  XML::GRDDL::Namespace,                  XML::GRDDL::Profile,
       XML::GRDDL::Transformation::RDF_EASE::Functional, XML::Saxon::XSLT2.

       HTML::HTML5::Parser, RDF::RDFa::Parser, HTML::Microformats.

       JSON::GRDDL.

       <http://www.w3.org/TR/grddl/>.

       <http://www.perlrdf.org/>.

       This module is derived from Swignition <http://buzzword.org.uk/swignition/>.

Synopsis

       High-level interface:

        my $grddl = XML::GRDDL->new;
        my $model = $grddl->data($xmldoc, $baseuri);
        # $model is an RDF::Trine::Model

       Low-level interface:

        my $grddl = XML::GRDDL->new;
        my @transformations = $grddl->discover($xmldoc, $baseuri);
        foreach my $t (@transformations)
        {
          # $t is an XML::GRDDL::Transformation
          my ($output, $mediatype) = $t->transform($xmldoc);
          # $output is a string of type $mediatype.
        }
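
       Since "discover" and "data" do not accept "tag soup" HTML directly, one possible approach is to parse
       it with HTML::HTML5::Parser first and pass the resulting XML::LibXML::Document on. The helper names
       "html_to_dom" and "grddl_from_html" below are hypothetical:

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;
use XML::GRDDL;

# Parse possibly malformed HTML into an XML::LibXML::Document.
sub html_to_dom {
    my ($html) = @_;
    return HTML::HTML5::Parser->new->parse_string($html);
}

# Hypothetical wrapper around the high-level interface for HTML input.
# Any %options (for example force_rel => 1) are passed through to data().
sub grddl_from_html {
    my ($html, $base, %options) = @_;
    my $dom = html_to_dom($html);
    return XML::GRDDL->new->data($dom, $base, %options);
}
```

       Whether "force_rel" is needed depends on whether the parsed page declares the GRDDL profile.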
