java - Partial read of XML file
I need to read the first 15 lines of about 100 XML files, each of which is up to 200,000 lines long. What is an efficient way to do this? The approaches I have seen try to parse the whole file at once.
Edit: The first 15 elements are metadata about the file (page name, last edit date, etc.), which I would like to parse into a table.
What you probably want to do here, as I wrote in the comment, is use a SAX parser and, when your stop condition is met, ...
Edit:
test.xml
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <first>
        <inner>data</inner>
    </first>
    <second>second</second>
    <third>third</third>
    <next>next</next>
</root>
ReadXmlUpToSomeElementSaxParser.java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ReadXmlUpToSomeElementSaxParser extends DefaultHandler {
    private final String lastElementToRead;

    public ReadXmlUpToSomeElementSaxParser(String lastElementToRead) {
        this.lastElementToRead = lastElementToRead;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        // only here to show which elements get parsed
        System.out.println("startElement: " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        // stop parsing as soon as the last element of interest is closed
        if (lastElementToRead.equals(qName)) {
            throw new MySaxTerminatorException();
        }
    }

    public static void main(String[] args) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser saxParser = factory.newSAXParser();
        try {
            saxParser.parse("src/test.xml", new ReadXmlUpToSomeElementSaxParser("second"));
        } catch (MySaxTerminatorException exp) {
            // nothing to do, this is expected
        }
    }

    public static class MySaxTerminatorException extends SAXException {
    }
}

Output
startElement: root
startElement: first
startElement: inner
startElement: second
Why is it better? Simply because some applications may give you
<third>third</third><next>next</next></root>
and a line-oriented approach will fail ...
I have provided a parser that does not count elements, to show that the stop condition can be defined based on business logic ...
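If you do want to stop after a fixed number of elements, as in the question, the same trick can be driven by a counter instead of an element name. This is only a minimal sketch under my own assumptions: the names FirstElementsReader, StopParsingException and readFirstElements are made up for illustration, and the XML is parsed from an in-memory string to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class FirstElementsReader extends DefaultHandler {

    /** Hypothetical marker exception, used only to abort parsing early. */
    public static class StopParsingException extends SAXException {
    }

    private final int maxElements;
    private final List<String> names = new ArrayList<>();

    public FirstElementsReader(int maxElements) {
        this.maxElements = maxElements;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        names.add(qName);
        // stop condition: enough opening tags have been seen
        if (names.size() >= maxElements) {
            throw new StopParsingException();
        }
    }

    /** Returns the names of the first maxElements opening tags, in document order. */
    public static List<String> readFirstElements(InputStream in, int maxElements) throws Exception {
        FirstElementsReader handler = new FirstElementsReader(maxElements);
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            parser.parse(in, handler);
        } catch (StopParsingException expected) {
            // limit reached, the rest of the input is not parsed
        }
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><first><inner>data</inner></first>"
                + "<second>second</second><third>third</third><next>next</next></root>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(readFirstElements(in, 3)); // prints [root, first, inner]
    }
}
```

For the real files you would pass a FileInputStream and 15 as the limit. Note that the count includes the root element and any nested elements, so the limit may need adjusting to match what you consider "the first 15 elements".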
characters() warning
You can read the element data in the characters() method. Please note that ...
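One caveat I am sure of with characters(): a SAX parser is allowed to deliver the text of a single element in several chunks, so the data should be accumulated and only used in endElement(). Below is a rough sketch of that, combined with the early-stop idea. The names MetadataReader, StopParsingException and readUpTo are my own and purely illustrative, and the sketch assumes the metadata elements contain plain text only (no mixed content).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MetadataReader extends DefaultHandler {

    /** Hypothetical marker exception, used only to abort parsing early. */
    public static class StopParsingException extends SAXException {
    }

    private final String lastElementToRead;
    private final Map<String, String> values = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();

    public MetadataReader(String lastElementToRead) {
        this.lastElementToRead = lastElementToRead;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // may be called more than once for one text node, so append instead of assigning
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.put(qName, value); // keep only elements that carried text
        }
        text.setLength(0);
        if (lastElementToRead.equals(qName)) {
            throw new StopParsingException();
        }
    }

    /** Parses until lastElementToRead is closed and returns element name -> text. */
    public static Map<String, String> readUpTo(InputStream in, String lastElementToRead) throws Exception {
        MetadataReader handler = new MetadataReader(lastElementToRead);
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            parser.parse(in, handler);
        } catch (StopParsingException expected) {
            // expected: the last element of interest has been read
        }
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><first><inner>data</inner></first>"
                + "<second>second</second><third>third</third><next>next</next></root>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(readUpTo(in, "second")); // prints {inner=data, second=second}
    }
}
```

The resulting map keeps document order, so it can be turned into a table row per file fairly directly.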