Python: validate and format JSON files
I have around 2000 JSON files which I am trying to run through a Python program. A problem occurs when a JSON file is not in the correct format (error: ValueError: No JSON object could be decoded). As a result, I cannot read it into my program.
I am currently doing something like:

    for files in folder:
        with open(files) as f:
            data = json.load(f)  # This part can raise an error
I know that there are offline methods for validating and formatting JSON files, but is there a programmatic way to check and format these files? If not, is there a free or inexpensive way to fix all of these files offline, i.e. I just run a program on the folder containing all the JSON files and it formats them as required?
SOLVED: It turned out that I was saving a file that was not in JSON format in my working directory, which was the same place I was reading the data from. Thanks for the helpful tips.

    import os
    import simplejson  # third-party; the built-in json module offers the same load()

    invalid_json_files = []
    read_json_files = []

    def parse():
        for files in os.listdir(os.getcwd()):
            with open(files) as json_file:
                try:
                    simplejson.load(json_file)
                    read_json_files.append(files)
                except ValueError as e:
                    # Record the bad file instead of stopping the whole run
                    print("JSON object issue: %s" % e)
                    invalid_json_files.append(files)
        print(invalid_json_files, len(read_json_files))
The built-in json module can be used as a validator:

    import json

    def parse(text):
        try:
            return json.loads(text)
        except ValueError as e:
            print('invalid json: %s' % e)
            return None  # or: raise
You can make it work with files by using:

    with open(filename) as f:
        return json.load(f)

instead of json.loads, and you can include the filename in the error message.

On Python 3.3.5, for {test: "foo"}, I get:

    invalid json: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

and on 2.7.6:

    invalid json: Expecting property name: line 1 column 2 (char 1)

The correct JSON is {"test": "foo"}.
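As a minimal sketch of the file-based variant described above, assuming Python 3, the function below puts the filename into the error message. The name parse_file is hypothetical and chosen only for illustration.

```python
import json


def parse_file(filename):
    """Return the parsed JSON from filename, or None if it is not valid JSON."""
    try:
        with open(filename) as f:
            return json.load(f)
    except ValueError as e:
        # On Python 3.5+ the json module raises json.JSONDecodeError,
        # which is a subclass of ValueError, so this clause catches it.
        print('invalid json in %s: %s' % (filename, e))
        return None
```

Returning None keeps a batch run going; re-raising instead would stop at the first bad file.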
When handling invalid files, it is best not to process them any further. You can create a skip.txt file that lists the files with errors, so they can be checked and fixed by hand.
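The skip-list idea could be sketched roughly as follows, assuming Python 3 and assuming the files of interest end in .json. The function name validate_folder and the tab-separated layout of skip.txt are illustrative choices, not something prescribed above.

```python
import json
import os


def validate_folder(folder, skip_path='skip.txt'):
    """Check every .json file in folder and list the invalid ones in skip_path."""
    valid, invalid = [], []
    for name in sorted(os.listdir(folder)):
        # Assumption: only files ending in .json are meant to be parsed.
        if not name.endswith('.json'):
            continue
        path = os.path.join(folder, name)
        try:
            with open(path) as f:
                json.load(f)
            valid.append(name)
        except ValueError as e:
            # Do not process the file further; just remember why it failed.
            invalid.append((name, str(e)))
    # One line per bad file: name, a tab, then the parser's error message.
    with open(skip_path, 'w') as out:
        for name, error in invalid:
            out.write('%s\t%s\n' % (name, error))
    return valid, invalid
```

Filtering on the extension also avoids the situation described in the question, where a non-JSON file sitting in the same directory was picked up by the loop.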
If possible, you should check the site or program that generates the invalid JSON files, fix it, and then re-generate the JSON files. Otherwise, you will keep getting new files that are invalid JSON.
Failing that, you will have to write a custom JSON parser that fixes common errors. If you do, keep the originals under source control (or archived) so that you can see and check the differences the automated tool introduces (as a sanity check). Ambiguous cases should be fixed by hand.
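A rough sketch of such a repair step, assuming Python 3 and assuming that trailing commas are the common error in question (that choice is only an example; the actual errors in the files are not known). The name repair is hypothetical. The diff it returns is meant for the sanity check mentioned above.

```python
import difflib
import json
import re

# A comma followed only by whitespace and a closing } or ].
# Caveat: this pattern cannot tell whether it is inside a string literal,
# so the diff should always be reviewed before the result is trusted.
TRAILING_COMMA = re.compile(r',(\s*[}\]])')


def repair(text):
    """Return (fixed_text, diff_lines), or (None, []) if the text cannot be fixed."""
    try:
        json.loads(text)
        return text, []  # already valid, nothing to change
    except ValueError:
        pass
    fixed = TRAILING_COMMA.sub(r'\1', text)
    try:
        json.loads(fixed)
    except ValueError:
        return None, []  # ambiguous case: leave it for a human
    diff = list(difflib.unified_diff(
        text.splitlines(), fixed.splitlines(),
        fromfile='original', tofile='repaired', lineterm=''))
    return fixed, diff
```

The function never overwrites anything itself, so the original text stays available for archiving or for committing to source control before the repaired version is written out.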