sogu
**DO**

- I am scraping JSON data from a website.
- The website is built for this, so scraping it is not an issue.
- I use `encoding = 'latin-1'` because that seems to work best with the text.
- Everything is fine until a key has Chinese names as its value. This never happened before; it only started a couple of days ago.
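
To illustrate what probably happens to the Chinese names, here is a minimal sketch that needs no network access. It assumes the server sends its JSON as UTF-8 bytes (common, but not confirmed for this API); the `record` payload and the name in it are made up for the demonstration.

```python
import json

# Hypothetical payload standing in for one API record (not real data).
record = {"123123": {"id": 4214152, "name": "中文名字"}}

# Assumption: the server sends the JSON body as UTF-8 bytes.
raw_bytes = json.dumps(record, ensure_ascii=False).encode("utf-8")

# Decoding as latin-1 never raises, because every byte maps to some
# character, but each 3-byte Chinese character becomes 3 unrelated ones.
as_latin1 = json.loads(raw_bytes.decode("latin-1"))
as_utf8 = json.loads(raw_bytes.decode("utf-8"))

print(as_latin1["123123"]["name"] == record["123123"]["name"])  # False
print(as_utf8["123123"]["name"] == record["123123"]["name"])    # True
print(len(as_latin1["123123"]["name"]))                         # 12
```

Plain ASCII text looks identical under both encodings, which may be why `'latin-1'` appeared to work until non-Latin names showed up.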


**PROBABLE SOLUTIONS**
- A.) Switch to an encoding that covers most languages globally (definitely including Chinese).
- B.) Force just the Chinese characters to be encoded differently instead of encoding them as 'latin-1'.
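
Both options can be sketched roughly as follows. The helpers `parse_json_bytes` and `repair_mojibake` are hypothetical names introduced only for this illustration, and the UTF-8-first order is an assumption about what the server sends, not something confirmed for this API.

```python
import json


def parse_json_bytes(raw: bytes) -> dict:
    """Option A: decode the body as UTF-8, falling back to latin-1."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not valid UTF-8, so keep the old behaviour.
        text = raw.decode("latin-1")
    return json.loads(text)


def repair_mojibake(text: str) -> str:
    """Option B: undo a wrong latin-1 decode on a single string."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The string was not UTF-8 misread as latin-1; leave it alone.
        return text


utf8_body = '{"name": "中文"}'.encode("utf-8")
latin1_body = '{"name": "café"}'.encode("latin-1")

print(parse_json_bytes(utf8_body)["name"])
print(parse_json_bytes(latin1_body)["name"])
print(repair_mojibake("中文".encode("utf-8").decode("latin-1")))
```

In the scraping loop, option A would mean passing `r.content` (the raw bytes of the response) to `parse_json_bytes` instead of setting `r.encoding` and calling `r.json()`. Option B only makes sense for data that was already stored after being decoded with the wrong encoding.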


**INPUT**

```
import requests
from datetime import date, datetime
from time import sleep
import re
# and import some other local libraries


mydata = {}
jsoned = 1
page = 0
while jsoned:
    r = requests.get('http://thewebsite-i-scrape-for-data.com/api.php?request=all&page=' + str(page), timeout=60)
    # First just print out the error; handle the error properly later.
    try:
        page += 1
        r.encoding = 'latin-1'
        jsoned = r.json()
        mydata = {**mydata,**jsoned}
        sleep(60)
        # TODO: decide how to handle a server that misbehaves to this extent
    except Exception:
        # Caution: if r.json() raised inside the try block, it raises again here
        print(page, r.json())
```


**PRINTOUTS**

```
{'123123': {'id': 4214152,
  'name': 'CHINESE CHARACTERS' ....
```



**ERROR WITHOUT THE TRY/EXCEPT/PASS**

```
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
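
For context, `Expecting value: line 1 column 1 (char 0)` means the parser found no JSON value at the very start of the body, which happens for an empty response or a non-JSON one such as an HTML error page. A minimal sketch for looking at what actually came back is below; `describe_body` is a hypothetical helper, and in the loop it would be fed `r.text`.

```python
import json


def describe_body(body: str):
    """Return (data, None) on success or (None, message) on a parse failure."""
    try:
        return json.loads(body), None
    except json.JSONDecodeError as exc:
        # Show where parsing stopped and how the body begins.
        preview = body[:80]
        return None, f"{exc.msg} at char {exc.pos}; body starts with {preview!r}"


print(describe_body('{"a": 1}'))
print(describe_body(""))
print(describe_body("<html>Too many requests</html>"))
```

If the preview shows an empty string or HTML, the failure is about what the server returned for that page rather than about the Chinese characters themselves.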

**TRIED SOLUTIONS**

- https://stackoverflow.com/questions/63831916/scrapy-python-web-scraping-json
- https://stackoverflow.com/questions/33294213/how-to-decode-unicode-in-a-chinese-text
- https://stackoverflow.com/questions/16573332/jsondecodeerror-expecting-value-line-1-column-1-char-0
- https://github.com/psf/requests/issues/4908
- https://stackoverflow.com/questions/34930301/character-encoding-from-chinese-to-latin1-in-python
- https://pypi.org/project/jieba/
- https://stackoverflow.com/questions/34587346/python-check-if-a-string-contains-chinese-character/34587637
- https://stackoverflow.com/questions/30817137/when-python-loads-json-how-to-convert-str-to-unicode-so-i-can-print-chinese-cha
- https://stackoverflow.com/questions/18337407/saving-utf-8-texts-with-json-dumps-as-utf8-not-as-u-escape-sequence
- https://stackoverflow.com/questions/44203397/python-requests-get-returns-improperly-decoded-text-instead-of-utf-8
- https://stackoverflow.com/questions/49702214/python-requests-response-encoded-in-utf-8-but-cannot-be-decoded
- https://stackoverflow.com/questions/45111047/encoding-error-in-python-api-response
- https://stackoverflow.com/questions/12309269/how-do-i-write-json-data-to-a-file
