c# - Stop Html Agility Pack from changing source code -


I want to change specific text in a group of HTML files and leave the rest of the code unchanged. I thought I would use the HTML Agility Pack, so I wrote code like this:

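    using HtmlAgilityPack;

    string url = @"http://www.example.com";
    HtmlWeb web = new HtmlWeb();
    // Present the request as coming from a desktop Chrome browser.
    web.UserAgent = @"Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36";
    HtmlDocument doc = web.Load(url);
    // Save() writes out the parsed document, not the bytes the server sent.
    doc.Save("a.html");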

But the problem is that the source of the saved page differs from the original source. Is there a way to stop HAP from changing it? Or is there another way to walk the DOM and change only specific things (like editing the HTML in Chrome Developer Tools, but done automatically)?

---------- EDIT ----------

For example, this happens on eBay. I can't post a link because it would count as advertising, but if you try this code on any item listing, you will see what happens.

---------- EDIT 2 ----------

It looks like eBay uses an iframe, and HAP can't handle it well. Tags inside it are removed, which is probably why the saved page differs so much from the original.

HtmlAgilityPack (HAP) will not necessarily write the same HTML that it reads. If you check the source, you will see that writing (the WriteTo method) goes through the parsed nodes rather than echoing the original text. If the server sends invalid HTML, HAP cleans it up as part of parsing, so the output differs.
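One way to see this for a given page is to round-trip a string through HAP and compare the result with the input. The following is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. The class name `RoundTripDemo`, the method `RoundTrip`, and the sample markup are made up for illustration. Whether a particular input comes back changed depends on the markup and on the HAP version and options, so no output is shown.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

public static class RoundTripDemo
{
    // Parses the markup and writes it back out through HAP's node writer.
    public static string RoundTrip(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        using (var writer = new StringWriter())
        {
            // Save(TextWriter) serializes the parsed tree, like Save("a.html").
            doc.Save(writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // Hypothetical sloppy markup: the <p> elements are never closed.
        string original = "<div><p>first<p>second</div>";
        string written = RoundTrip(original);

        Console.WriteLine(written == original ? "unchanged" : "changed by HAP");
        Console.WriteLine(written);
    }
}
```

If the comparison reports a change for your input, `doc.Save` will not reproduce that page byte for byte.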

If you need to save the original, download the raw response to a file first, and then load the saved file with HAP.
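A minimal sketch of that approach follows. It assumes `System.Net.WebClient` for the download, which is a choice made here for illustration; any HTTP client that writes the response body to disk unmodified would do. The file name `original.html` and the helper names `DownloadOriginal` and `LoadCopy` are hypothetical. Note that `WebClient` is marked obsolete in recent .NET versions, although it still works.

```csharp
using System.Net;
using HtmlAgilityPack;

public static class SaveOriginalThenParse
{
    // Writes the server's response to disk without parsing or cleaning it.
    public static void DownloadOriginal(string url, string path)
    {
        using (var client = new WebClient())
        {
            client.Headers[HttpRequestHeader.UserAgent] =
                "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 " +
                "(KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36";
            client.DownloadFile(url, path);
        }
    }

    // Loads the saved copy into HAP for querying; the file itself is not modified.
    public static HtmlDocument LoadCopy(string path)
    {
        var doc = new HtmlDocument();
        doc.Load(path);
        return doc;
    }

    public static void Main()
    {
        string url = "http://www.example.com"; // placeholder URL from the question
        string path = "original.html";         // hypothetical output file

        DownloadOriginal(url, path);
        HtmlDocument doc = LoadCopy(path);

        // Use doc to inspect the page. Calling doc.Save(path) would overwrite
        // the untouched copy with HAP's cleaned-up version, so avoid that.
        System.Console.WriteLine(doc.DocumentNode.InnerText.Length);
    }
}
```

The file on disk stays identical to what the server sent, and HAP is used only to read it.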

