Posts tagged with xml
Google doesn't seem to be able to find these mailing list archives, so...
- axkit-dev mailing list archive - the only up-to-date archive of axkit-dev@xml.apache.org that I can find.
- axkit-users mailing list archive - this seems to be the best archive of axkit-users@axkit.org. There are others - the official one would be here but it never seems to be up to date at all.
RDF explained in a way that makes sense to XML people
Posted on November 09, 2004 at 12:00 PM
Categories: xml
XSLT code to create XML output from XML in an <xsl:variable>
Posted on July 04, 2004 at 12:00 PM
Categories: code, xml
Here's some code I just wrote to parse XML out of a string in XSLT. In XSLT, URL parameters arrive as strings. You can't just copy them into the output, because they will come out fully entity-encoded. So, let's say you wanted to have a form field where someone could enter XHTML, which you'd then store in an XML file. Of course you don't want to store &lt;p&gt;whatever...&lt;/p&gt; in the file. So you have to actually process the incoming string and create output XML elements as you go. To get you started, here's some XSLT code that does an OK job of this:
It's pretty easy to call into, just do something like this:
<xsl:call-template name="parseXMLParam">
<xsl:with-param name="input" select="$url_content"/>
</xsl:call-template>
... and that's about it. I think it should generally work for XHTML, but it may fail when an element is nested inside another element of the same name (e.g. a para inside a para), or when names nest in a repeating pattern. Since that doesn't happen much in XHTML, it should be useful for that at least.
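For illustration only, here is a rough sketch of how a template with this name could work. It is not the code referred to above, just a minimal version under some big assumptions: the input is well-formed, tags carry no attributes, there are no self-closing tags, and no element contains another element of the same name. The default value of url_content and the content wrapper element are made up so the stylesheet can run on its own.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Hypothetical stand-in for the string that arrives from the URL -->
  <xsl:param name="url_content"
             select="'&lt;p&gt;Hello &lt;em&gt;world&lt;/em&gt;&lt;/p&gt;'"/>

  <xsl:template match="/">
    <content>
      <xsl:call-template name="parseXMLParam">
        <xsl:with-param name="input" select="$url_content"/>
      </xsl:call-template>
    </content>
  </xsl:template>

  <!-- Recursively turns a string of markup into real output elements -->
  <xsl:template name="parseXMLParam">
    <xsl:param name="input"/>
    <xsl:choose>
      <xsl:when test="contains($input, '&lt;')">
        <!-- Emit any text that precedes the first tag -->
        <xsl:value-of select="substring-before($input, '&lt;')"/>
        <xsl:variable name="rest" select="substring-after($input, '&lt;')"/>
        <!-- Assumes the tag holds nothing but the element name -->
        <xsl:variable name="name" select="substring-before($rest, '&gt;')"/>
        <xsl:variable name="afterOpen" select="substring-after($rest, '&gt;')"/>
        <xsl:variable name="close" select="concat('&lt;/', $name, '&gt;')"/>
        <!-- Everything up to the FIRST matching close tag becomes the content,
             which is why same-name nesting breaks this approach -->
        <xsl:element name="{$name}">
          <xsl:call-template name="parseXMLParam">
            <xsl:with-param name="input" select="substring-before($afterOpen, $close)"/>
          </xsl:call-template>
        </xsl:element>
        <!-- Carry on with whatever follows the close tag -->
        <xsl:call-template name="parseXMLParam">
          <xsl:with-param name="input" select="substring-after($afterOpen, $close)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- No markup left: plain text -->
        <xsl:value-of select="$input"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

With the default parameter value, the result should be a content element wrapping a real p element that contains the text and an em child, rather than an escaped string.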
A recent thread on axkit-users brought me this tip. If you have an XML data store with a large number of nodes, you can use xsl:key to pre-create an index on the table. David Nolan (of CMU) wrote (link not available, sorry):
Basically xsl:key (I mis-remembered the prefix before...) pre-creates an index on the table, based on whatever attributes/nodes you choose. For example, if your xml looks like:
<foo>
<bar name="X">...</bar>
<bar name="Y">...</bar>
...
</foo>
And you need to select "/foo/bar[@name='X']", doing so directly is cheap if your XML is small. But if it's big you should create a key, especially if you're doing selections of that type often. So before your templates you do:
<xsl:key name="barby_name" match="/foo/bar" use="@name"/>
Then when you need a piece of data you use "key('barby_name','X')".
You can even make up a concatenated index, if you need a multi-part search. i.e.:
and then select "key('foo',concat($fooname,'-',$foo_type))". Just make sure the string you're using to separate the search elements isn't valid as part of the content of the elements.
Proper application of xsl:key can be very useful. One of our (non-AxKit) translations is taking flat database dumps with approximately 10,000 nodes and converting them to a structured format, based on the structure of our database, with approximately 145,000 nodes. Using xsltproc (which is slower than Saxon, but more widely available) that translation takes 30-40 seconds. Without keys it was taking over 10 minutes.
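A declaration for a concatenated key like the one described above might look something like the following. This is only a sketch: the type attribute, the key names bar_by_name and bar_by_name_and_type, and the two parameters are invented for illustration and do not come from the original thread.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input shape (hypothetical):
       <foo>
         <bar name="X" type="a">first</bar>
         <bar name="X" type="b">second</bar>
       </foo> -->

  <!-- Single-part index on the name attribute -->
  <xsl:key name="bar_by_name" match="/foo/bar" use="@name"/>

  <!-- Two-part index: name and type joined with '-'.
       The separator must never occur inside either value. -->
  <xsl:key name="bar_by_name_and_type" match="/foo/bar"
           use="concat(@name, '-', @type)"/>

  <!-- Hypothetical search values; a caller can override them -->
  <xsl:param name="bar_name" select="'X'"/>
  <xsl:param name="bar_type" select="'b'"/>

  <xsl:template match="/">
    <!-- Looks the node up through the index instead of scanning /foo/bar -->
    <xsl:value-of
        select="key('bar_by_name_and_type', concat($bar_name, '-', $bar_type))"/>
  </xsl:template>
</xsl:stylesheet>
```

The lookup string has to be built with exactly the same concat() pattern as the use attribute, otherwise nothing will match.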
Read this and then ask me whether or not I'm going to insert the rdf:about attribute into my weblog's RSS 1.0 output. I'm not going to repeat the same damn link URL twice in my data just because some RDF weenies think there's a difference between a URI and a whatever even if they are both the same, character-for-character. Since the name "link" just makes more sense, and it's used by more people, I'm using that.
Taking the pulse of XML editing
It's a good article. Also, I think that XML fans are also people with big vocabularies.
After the first release of Alexandra, I discussed a major flaw with several people: it did not preserve the RNG order as given in the schema when it added or changed elements. I was concerned I would need to use some sort of database update-type scheme. I wasn't keen on this, as the only XML update project that seemed to be anywhere near a state of completion was XUpdate, and it wasn't very complete and also seemed like a hassle. I realized later that I could generate XSLT to do the update. More recently I was working on how to pass through the positioning information for both elements that are already instantiated and ones that aren't. I seem to have that completed now. So, at this point I believe it is possible to generate only RNG-valid instances using Alexandra.
So, I added a feature to view the DocBook source for those documents that actually have it. So far it's only activated in the ICT section, but I'll probably turn it on for the other parts as well. The idea is just that you can add ?show_source=YES to the end of the URL and get the DocBook (XML) source. If it's not DocBook, nothing will happen: everything else is just XHTML anyway, so you might as well just view the source in the browser.
AxKit already has a way to do this using configuration directives in the Apache configuration files (or .htaccess), but I've been avoiding config directives up to now and using processing instructions (PIs) instead. It fits with my document-centric view of how this should all work. It wasn't too hard to reproduce the function in XSLT using an <xsl:param>; I just had to add a bit of code to Norman's XSLT for DocBook and a little code in another place.
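The general shape of that kind of switch can be sketched as follows. This is not the actual change made to the DocBook stylesheets. It assumes the processor hands the show_source query parameter to the stylesheet as a top-level parameter, and the default value 'NO' is made up.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Filled in from ?show_source=YES; 'NO' is an assumed default -->
  <xsl:param name="show_source" select="'NO'"/>

  <xsl:template match="/">
    <xsl:choose>
      <xsl:when test="$show_source = 'YES'">
        <!-- Pass the source document through untouched -->
        <xsl:copy-of select="/"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- Normal rendering; the real formatting templates would take over here -->
        <xsl:apply-templates/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

In a customization layer that imports an existing set of stylesheets, the otherwise branch would more likely use xsl:apply-imports so the imported root template still runs.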
Every time I read about DocBook there's one thing everyone says: it's complicated. Well, I don't get it. DocBook is not complicated. I mean, sure, it's got a few hundred different elements, but you don't actually need most of them. You basically need book, article, section, title, para, ulink, table, and itemizedlist. That's it. All of those, except for section and title, map directly onto XHTML. Section and title are actually better than HTML once you get used to how they work, and they make the magic that makes DocBook so much better than HTML, because it's structured, not just presentational. Then there are the awesome XSLT stylesheets from Norman, so all you really need to do is make a few changes in the params.xsl file and write up a CSS file, and you're set to go. It's got all kinds of nice features like automatic table of contents building, indexing, etc. And if you ever discover you need to add a reference, or a quotation, you just pull out the reference and it explains what to do (with examples). You insert the right tags, and the XSLT does a nice baseline formatting job for you automatically. You can tweak to your heart's content.
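To give an idea of how little is needed, here is a small made-up article that sticks to a handful of those elements. The titles, text, and link target are placeholders, the bulleted list is written with itemizedlist, and the markup assumes DocBook 4.x style (no namespace).

```xml
<?xml version="1.0"?>
<article>
  <title>A Small DocBook Example</title>
  <section>
    <title>First section</title>
    <para>Plain text with a link to
      <ulink url="http://www.docbook.org/">the DocBook site</ulink>.</para>
    <itemizedlist>
      <listitem><para>First point</para></listitem>
      <listitem><para>Second point</para></listitem>
    </itemizedlist>
    <section>
      <title>A nested section</title>
      <para>Nesting sections is what gives the document its structure;
        the heading level follows from the depth.</para>
    </section>
  </section>
</article>
```

The nested section needs no explicit heading level, which is the structural advantage over numbered HTML headings.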
So basically if you start writing with DocBook, you get a lot of good stuff for free. It comes with a bunch of good styling transformations.