<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Extracting Links using Xpath</title>
	<atom:link href="http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/</link>
	<description>My thoughts on ColdFusion, Flex and other RIA stuff....</description>
	<lastBuildDate>Tue, 07 Feb 2012 10:21:04 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
	<item>
		<title>By: Anuj Gakhar</title>
		<link>http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/comment-page-1/#comment-2965</link>
		<dc:creator>Anuj Gakhar</dc:creator>
		<pubDate>Mon, 15 Sep 2008 13:43:06 +0000</pubDate>
		<guid isPermaLink="false">http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/#comment-2965</guid>
		<description>Thanks Paradores</description>
		<content:encoded><![CDATA[<p>Thanks Paradores</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Paradores</title>
		<link>http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/comment-page-1/#comment-2964</link>
		<dc:creator>Paradores</dc:creator>
		<pubDate>Mon, 15 Sep 2008 12:58:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/#comment-2964</guid>
		<description>Really handy code - thanks a lot</description>
		<content:encoded><![CDATA[<p>Really handy code &#8211; thanks a lot</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anuj Gakhar</title>
		<link>http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/comment-page-1/#comment-2958</link>
		<dc:creator>Anuj Gakhar</dc:creator>
		<pubDate>Wed, 10 Sep 2008 14:47:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/#comment-2958</guid>
		<description>@house dj, that probably means the HTML is invalid. What&#039;s the URL you are trying to parse?</description>
		<content:encoded><![CDATA[<p>@house dj, that probably means the HTML is invalid. What&#8217;s the URL you are trying to parse?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: house dj</title>
		<link>http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/comment-page-1/#comment-2955</link>
		<dc:creator>house dj</dc:creator>
		<pubDate>Wed, 10 Sep 2008 12:55:30 +0000</pubDate>
		<guid isPermaLink="false">http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/#comment-2955</guid>
		<description>Hi, it doesn&#039;t work for me. Does that indicate invalid HTML? Please say NO! My webmaster tells me that my page is valid now! Thanks for your help.</description>
		<content:encoded><![CDATA[<p>Hi, it doesn&#8217;t work for me. Does that indicate invalid HTML? Please say NO! My webmaster tells me that my page is valid now! Thanks for your help.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ana gomez</title>
		<link>http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/comment-page-1/#comment-2799</link>
		<dc:creator>ana gomez</dc:creator>
		<pubDate>Tue, 27 May 2008 01:54:06 +0000</pubDate>
		<guid isPermaLink="false">http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/#comment-2799</guid>
		<description>That is very useful code. Thanks</description>
		<content:encoded><![CDATA[<p>That is very useful code. Thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anuj Gakhar</title>
		<link>http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/comment-page-1/#comment-82</link>
		<dc:creator>Anuj Gakhar</dc:creator>
		<pubDate>Tue, 27 Nov 2007 15:31:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/#comment-82</guid>
		<description>In my example above, I could also do this:

&lt;cfset allLinks = XmlSearch(myXml, &quot;//*[starts-with(name(), &#039;a&#039;) and string-length(name()) = 1]&quot;) /&gt;

instead of

&lt;cfset allLinks = XmlSearch(myXml, &quot;//*[local-name()=&#039;a&#039;]&quot;)&gt;

Cool!</description>
		<content:encoded><![CDATA[<p>In my example above, I could also do this:</p>
<p>&lt;cfset allLinks = XmlSearch(myXml, &quot;//*[starts-with(name(), 'a') and string-length(name()) = 1]&quot;) /&gt;</p>
<p>instead of</p>
<p>&lt;cfset allLinks = XmlSearch(myXml, &quot;//*[local-name()='a']&quot;)&gt;</p>
<p>Cool!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anuj Gakhar</title>
		<link>http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/comment-page-1/#comment-81</link>
		<dc:creator>Anuj Gakhar</dc:creator>
		<pubDate>Tue, 27 Nov 2007 14:49:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/#comment-81</guid>
		<description>Yeah, I saw that in the downloadable code in the example you sent over earlier. Cool stuff! I&#039;ve got something to play with for the next couple of days :)
Cheers, mate!</description>
		<content:encoded><![CDATA[<p>Yeah, I saw that in the downloadable code in the example you sent over earlier. Cool stuff! I&#8217;ve got something to play with for the next couple of days <img src='http://www.anujgakhar.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
Cheers, mate!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan G. Switzer, II</title>
		<link>http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/comment-page-1/#comment-80</link>
		<dc:creator>Dan G. Switzer, II</dc:creator>
		<pubDate>Tue, 27 Nov 2007 14:43:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/#comment-80</guid>
		<description>Crud, the sHtml string didn&#039;t show up properly. It should be:

sHtml = &quot;[html][body][p]hello world[/p][p]All your base are belong to us![/p][/body][/html]&quot;;

Just replace the brackets ([ and ]) with less-than and greater-than characters.</description>
		<content:encoded><![CDATA[<p>Crud, the sHtml string didn&#8217;t show up properly. It should be:</p>
<pre class="brush: xml; title: ; notranslate">sHtml = "[html][body][p]hello world[/p][p]All your base are belong to us![/p][/body][/html]";</pre>
<p>Just replace the brackets ([ and ]) with less-than and greater-than characters.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan G. Switzer, II</title>
		<link>http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/comment-page-1/#comment-79</link>
		<dc:creator>Dan G. Switzer, II</dc:creator>
		<pubDate>Tue, 27 Nov 2007 14:41:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/#comment-79</guid>
		<description>@Anuj:

Check out the jTidy SourceForge forums. There are a few discussions on the list about the various DOM parsers that people have used.

Also, I should clarify and say that I have successfully used the jTidy DOM parser. It&#039;s just not particularly robust and can at times be frustrating to work with. I&#039;m hacking this code together from a CFC I wrote to give an example, so I&#039;m not sure it works as-is:

// an HTML string to parse
sHtml = &quot;hello worldAll your base are belong to us!&quot;;

// init jTidy
jTidy = createObject(&quot;java&quot;, &quot;org.w3c.tidy.Tidy&quot;);
jTidyConfigObj = createObject(&quot;java&quot;, &quot;org.w3c.tidy.Configuration&quot;);

// set configuration items
jTidy.setCharEncoding(jTidyConfigObj.utf8);
jTidy.setMakeClean(true);
jTidy.setDropFontTags(true);
jTidy.setXHTML(true);
jTidy.setRawOut(true);
jTidy.setSmartIndent(true);
jTidy.setWord2000(true);
jTidy.setDropEmptyParas(true);
jTidy.setShowWarnings(false);
jTidy.setFixComments(true);
jTidy.setQuiet(true);

// convert the HTML string to a byte array (UTF-8, to match the encoding set above)
oReadBuffer = CreateObject(&quot;java&quot;,&quot;java.lang.String&quot;).init(sHtml).getBytes(&quot;UTF-8&quot;);
// wrap the bytes in a ByteArrayInputStream, which is what jTidy needs
oHtmlBAIS = createobject(&quot;java&quot;,&quot;java.io.ByteArrayInputStream&quot;).init(oReadBuffer);

// do the parsing (read the input stream and build a DOM; no output stream is passed)
oTidyDOM = jTidy.parseDOM(oHtmlBAIS, javaCast(&quot;null&quot;, &quot;&quot;));

// close the BAIS stream
oHtmlBAIS.close();

// get all the paragraph tags
oParagraphTags = oTidyDOM.getElementsByTagName(javaCast(&quot;string&quot;, &quot;p&quot;));

The &quot;oParagraphTags&quot; variable would then contain all the p elements in the document.

I&#039;ve actually used jTidy to create a CFC that converts HTML to a plain-text document. I needed the functionality for e-mailing content from a knowledge base system to non-HTML-based clients.</description>
		<content:encoded><![CDATA[<p>@Anuj:</p>
<p>Check out the jTidy SourceForge forums. There are a few discussions on the list about the various DOM parsers that people have used.</p>
<p>Also, I should clarify and say that I have successfully used the jTidy DOM parser. It&#8217;s just not particularly robust and can at times be frustrating to work with. I&#8217;m hacking this code together from a CFC I wrote to give an example, so I&#8217;m not sure it works as-is:</p>
<p>// an HTML string to parse<br />
sHtml = &quot;hello worldAll your base are belong to us!&quot;;</p>
<p>// init jTidy<br />
jTidy = createObject(&quot;java&quot;, &quot;org.w3c.tidy.Tidy&quot;);<br />
jTidyConfigObj = createObject(&quot;java&quot;, &quot;org.w3c.tidy.Configuration&quot;);</p>
<p>// set configuration items<br />
jTidy.setCharEncoding(jTidyConfigObj.utf8);<br />
jTidy.setMakeClean(true);<br />
jTidy.setDropFontTags(true);<br />
jTidy.setXHTML(true);<br />
jTidy.setRawOut(true);<br />
jTidy.setSmartIndent(true);<br />
jTidy.setWord2000(true);<br />
jTidy.setDropEmptyParas(true);<br />
jTidy.setShowWarnings(false);<br />
jTidy.setFixComments(true);<br />
jTidy.setQuiet(true);</p>
<p>// convert the HTML string to a byte array (UTF-8, to match the encoding set above)<br />
oReadBuffer = CreateObject(&quot;java&quot;,&quot;java.lang.String&quot;).init(sHtml).getBytes(&quot;UTF-8&quot;);<br />
// wrap the bytes in a ByteArrayInputStream, which is what jTidy needs<br />
oHtmlBAIS = createobject(&quot;java&quot;,&quot;java.io.ByteArrayInputStream&quot;).init(oReadBuffer);</p>
<p>// do the parsing (read the input stream and build a DOM; no output stream is passed)<br />
oTidyDOM = jTidy.parseDOM(oHtmlBAIS, javaCast(&quot;null&quot;, &quot;&quot;));</p>
<p>// close the BAIS stream<br />
oHtmlBAIS.close();</p>
<p>// get all the paragraph tags<br />
oParagraphTags = oTidyDOM.getElementsByTagName(javaCast(&quot;string&quot;, &quot;p&quot;));</p>
<p>The &quot;oParagraphTags&quot; variable would then contain all the p elements in the document.</p>
<p>I&#8217;ve actually used jTidy to create a CFC that converts HTML to a plain-text document. I needed the functionality for e-mailing content from a knowledge base system to non-HTML-based clients.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anuj Gakhar</title>
		<link>http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/comment-page-1/#comment-78</link>
		<dc:creator>Anuj Gakhar</dc:creator>
		<pubDate>Tue, 27 Nov 2007 14:14:07 +0000</pubDate>
		<guid isPermaLink="false">http://www.anujgakhar.com/2007/11/21/extracting-links-using-xpath/#comment-78</guid>
		<description>Oh, that&#039;s very nice! I like the example he demonstrated... and thanks to you for saving me time, as I was going to start testing out jTidy if you hadn&#039;t sent me this example!

Do you know of any other DOM parsers that can be used?</description>
		<content:encoded><![CDATA[<p>Oh, that&#8217;s very nice! I like the example he demonstrated&#8230; and thanks to you for saving me time, as I was going to start testing out jTidy if you hadn&#8217;t sent me this example!</p>
<p>Do you know of any other DOM parsers that can be used?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

