The XML is quite simple and looks something like this:
- xml
- entry
- name
- adress
- (...)
StreamReader
For reference, I started off with a simple stream read:
// Read the file line by line and echo each line to the console.
using (StreamReader reader = File.OpenText("c:\\testfile2.xml"))
{
    string input = null;
    while ((input = reader.ReadLine()) != null)
    {
        Console.WriteLine(input);
    }
}
And the result:
watch.Elapsed = {00:00:02.3910558}
The result is pretty much as expected: about two seconds to read all of the lines and put them on the screen.
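The post does not show how `watch.Elapsed` was produced. A minimal sketch of such a timing harness, assuming a `System.Diagnostics.Stopwatch` was used, could look like this. The names `TimingSketch`, `Measure` and `CountLines` and the temporary sample file are hypothetical and only there to make the example runnable. The sketch counts lines instead of printing them, so console output does not end up in the measurement.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class TimingSketch
{
    // Runs the action once and returns the wall-clock time it took.
    public static TimeSpan Measure(Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }

    // Reads every line without printing it and returns the line count.
    public static int CountLines(string path)
    {
        int lines = 0;
        using (StreamReader reader = File.OpenText(path))
        {
            while (reader.ReadLine() != null)
            {
                lines++;
            }
        }
        return lines;
    }

    static void Main()
    {
        // Hypothetical sample file so the sketch runs anywhere.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "<xml>\n<entry><name>a</name></entry>\n</xml>\n");

        int lines = 0;
        TimeSpan elapsed = Measure(() => lines = CountLines(path));
        Console.WriteLine("Read {0} lines in {1}", lines, elapsed);

        File.Delete(path);
    }
}
```

Passing the work in as an `Action` keeps the timing code identical for each approach, so only the body of the measured call changes between runs.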
XMLReader
Next up is the XML reader. It traverses the XML in a straightforward way and reads all of the nodes.
// Walk every node in document order; Value is empty for element nodes.
using (FileStream stream = new FileStream("c:\\testfile2.xml", FileMode.Open))
using (XmlReader reader = new XmlTextReader(stream))
{
    while (reader.Read())
    {
        Console.WriteLine(reader.Value);
    }
}
And the result:
watch.Elapsed = {00:00:12.7904473}
12 seconds is about as expected. There is some overhead in locating the XML nodes, but it all seems much as expected. If we ran this experiment over the internet or a slower network file share, I believe the XmlReader overhead would be less visible.
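The loop above prints `reader.Value` for every node, including element nodes, whose value is empty. As a rough sketch of a more selective read, the following checks `NodeType` and collects only the text inside `name` elements. It assumes the `xml` / `entry` / `name` / `adress` layout described at the top of the post. The `XmlReaderSketch` class, the `ReadNames` helper and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class XmlReaderSketch
{
    // Collects the text content of every <name> element in document order.
    public static List<string> ReadNames(TextReader input)
    {
        List<string> names = new List<string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                bool isName = reader.NodeType == XmlNodeType.Element
                              && reader.Name == "name"
                              && !reader.IsEmptyElement;
                if (isName)
                {
                    // Step from the start tag to its first child node.
                    reader.Read();
                    if (reader.NodeType == XmlNodeType.Text)
                    {
                        names.Add(reader.Value);
                    }
                }
            }
        }
        return names;
    }

    static void Main()
    {
        // Hypothetical sample data in the layout described in the post.
        string xml =
            "<xml>" +
            "<entry><name>Ann</name><adress>Main St 1</adress></entry>" +
            "<entry><name>Bob</name><adress>Side St 2</adress></entry>" +
            "</xml>";

        foreach (string name in ReadNames(new StringReader(xml)))
        {
            Console.WriteLine(name);
        }
    }
}
```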
XPath
Next, we try the XPath approach:
FileStream stream = new FileStream("c:\\testfile2.xml", FileMode.Open);
XPathDocument document = new XPathDocument(stream);
XPathNavigator navigator = document.CreateNavigator();
XPathNodeIterator node = navigator.Select("xml/entry");
for(...){...}
And the result:
watch.Elapsed = {00:03:29.5681325}
I knew it would take some time, but three and a half minutes is not acceptable. XPath is still something of a favourite, as it makes it possible to navigate the XML tree in an absolutely beautiful way.
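The body of the loop is elided in the snippet above, so what the original did with each node is unknown. As an illustration only, here is one way the `XPathNodeIterator` could be walked, assuming each `entry` has a `name` child as in the layout described earlier. The `XPathSketch` class, the `SelectNames` helper and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

static class XPathSketch
{
    // Selects every xml/entry node and returns the value of its <name> child.
    public static List<string> SelectNames(TextReader input)
    {
        XPathDocument document = new XPathDocument(input);
        XPathNavigator navigator = document.CreateNavigator();
        XPathNodeIterator entries = navigator.Select("xml/entry");

        List<string> names = new List<string>();
        while (entries.MoveNext())
        {
            // Current is positioned on the matched <entry> node.
            XPathNavigator name = entries.Current.SelectSingleNode("name");
            if (name != null)
            {
                names.Add(name.Value);
            }
        }
        return names;
    }

    static void Main()
    {
        // Hypothetical sample data in the layout described in the post.
        string xml =
            "<xml>" +
            "<entry><name>Ann</name><adress>Main St 1</adress></entry>" +
            "<entry><name>Bob</name><adress>Side St 2</adress></entry>" +
            "</xml>";

        foreach (string name in SelectNames(new StringReader(xml)))
        {
            Console.WriteLine(name);
        }
    }
}
```

Note that `XPathDocument` loads the whole document into memory before any query runs, which is one plausible contributor to the cost on a large file.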
DataSet
Next is the dataset approach.
DataSet ds = new DataSet();
ds.ReadXml("C:\\testfile2.xml");
foreach (DataTable tbl in ds.Tables)
{
    foreach (DataRow dr in tbl.Rows)
    {
        for (...){...}
    }
}
And the result:
watch.Elapsed = {00:00:03.6352829}
I'm a bit surprised. The DataSet seems to be the fastest way of actually traversing an XML file; only the raw StreamReader read, which does no XML parsing, was quicker. Note that DataSet navigation can be cumbersome when it contains a lot of tables.
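The innermost loop is elided in the snippet above. A small sketch of what a full traversal might look like, assuming the goal is to visit every column of every row, is shown here. It relies on `ReadXml` inferring the schema, which for the layout described earlier should give one `entry` table with `name` and `adress` columns. The `DataSetSketch` class, the `Flatten` helper and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;

static class DataSetSketch
{
    // Loads XML into a DataSet (schema inferred) and flattens every cell
    // into a "table.column=value" string.
    public static List<string> Flatten(TextReader input)
    {
        DataSet ds = new DataSet();
        ds.ReadXml(input);

        List<string> cells = new List<string>();
        foreach (DataTable tbl in ds.Tables)
        {
            foreach (DataRow dr in tbl.Rows)
            {
                foreach (DataColumn col in tbl.Columns)
                {
                    cells.Add(tbl.TableName + "." + col.ColumnName + "=" + dr[col]);
                }
            }
        }
        return cells;
    }

    static void Main()
    {
        // Hypothetical sample data in the layout described in the post.
        string xml =
            "<xml>" +
            "<entry><name>Ann</name><adress>Main St 1</adress></entry>" +
            "<entry><name>Bob</name><adress>Side St 2</adress></entry>" +
            "</xml>";

        foreach (string cell in Flatten(new StringReader(xml)))
        {
            Console.WriteLine(cell);
        }
    }
}
```

With nested XML the inferred schema splits into several related tables, which is where the navigation starts to become cumbersome.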
A smaller XML file
I've also tried the same approaches on a 5 KB XML file. Here are the results; this time XPath turns out to be the fastest of the XML-aware methods, only marginally slower than the plain stream read:
FileStream:
watch.Elapsed = {00:00:00.0144736}
XMLReader:
watch.Elapsed = {00:00:00.0302896}
Xpath:
watch.Elapsed = {00:00:00.0151563}
Dataset:
watch.Elapsed = {00:00:00.0225523}
15 comments:
I'm curious why you didn't try something like:
XmlDocument doc = new XmlDocument();
doc.Load( "c:\\testfile2.xml" );
foreach (XmlNode nd in doc.SelectNodes( "/xml/entry" ))
{
...
}
XmlDocument lets you easily navigate through your xml but it is very resource intensive. If your goal is performance, use something else.
http://msdn2.microsoft.com/en-us/library/ms998559.aspx
What about XmlTextReader?
Your problem is the Console.WriteLine: it is so slow that you cannot make good measurements.
In my projects, the XmlReader is up to 25 times faster than the DataSet with XML files bigger than 15 MB.
Hi, I was browsing the net for a solution to a problem I am facing. I would like to use XPath to read an XML file and put the result in a DataSet. I am trying to use a GridView to edit it. Is this possible? So far I have only seen the same question posted at various forums, but without any answers.
Let me know if you can help me. You may post the answer, if you have one, on my non-tech blog.
Try VTD-XML (http://vtd-xml.sf.net)
the next generation document centric, all purpose XML parser/indexer
Hi Anders!
I need your opinion!
I'm going to create a web search crawler and I was wondering where to start. I have done some experiments with SQL, and the database is fast at reading data, but it will get very heavy soon and then it will not be so fast. So my idea is to record the crawler's output from the websites (text, images, links, etc.) in XML and then work on it with SQL. So the real question is: which is faster to read, SQL (from a heavy database) or XML (from multiple files)?
thanks
Ps- if you would like to contact me please use this email: immanouel@sapo.pt
The fastest database in existence that will do exactly what you are asking with little to no coding (it was built for logging data from a web crawler search engine) is at http://www.la-la-land.net. You can bring data in as XML and then query against it like SQL; you don't even need to "move it over". It can handle quadrillions of records in less than 1 millisecond reading and writing time. Its indexes have indexes on the indexes. That's why it's so fast, and it's open source with no license. You can use it all you want, resell it even. Good luck.
Thanks for this great post.
I have also tested which is the fastest way to read XML from a file. Only the DataSet was very slow in my tests.
You also forgot to use XmlDocument.Load("path and filename");
Up to 20 KB the StreamReader was the fastest. Above that, XmlDocument.Load() was the fastest.
I hope that this comment is useful to somebody (-;
This is a nice article.
It's very easy to understand,
and it is useful for learning something about the topic.
Thanks a lot..!
Hi, I have a question. I have to parse and update (based on the system configuration) an XML file of about 500 KB, which takes more than 15 minutes using the XmlDocument approach.
Please tell me the best alternative, so that the same can be achieved in 1 or 2 minutes.
It's so easy to understand, and to decide which one I should use. Very nice article, keep it up :-)
Hi,
Can anyone tell me how to read an XML file with XmlTextReader, append to it, and then write it back to the same file?
Thanks,
Very, very bad article for beginners to understand :-(
Nice post. Got a lot of important stuff from this blog.