**WHAT I DO**

- I am scraping JSON data from a website.
- The website is built for this, so scraping it is not a problem.
- I use ```encoding = 'latin-1'``` because it seems to work best with the text. With ```encoding = 'utf-8'``` I got tons of errors (long ago).
- Everything works until a key has Chinese names in its values. This never happened before; it started only a couple of days ago.


**PROBABLE SOLUTIONS**
- A.) Switch to an encoding that covers most languages globally (definitely including Chinese).
- B.) Force just the Chinese characters to be encoded differently instead of encoding them as 'latin-1'.
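
To make option A concrete, here is a minimal sketch that decodes the raw response bytes as UTF-8 first and only falls back to Latin-1 if that fails. It assumes the server actually sends UTF-8, which is not confirmed here. `decode_json_bytes` and the two sample payloads are hypothetical names and data made up for illustration.

```python
import json


def decode_json_bytes(raw: bytes):
    """Parse a JSON payload, preferring UTF-8 and falling back to Latin-1."""
    try:
        # "utf-8-sig" also strips a leading byte-order mark if one is present.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this decode cannot fail.
        text = raw.decode("latin-1")
    return json.loads(text)


# Made-up payloads standing in for the real response bodies.
utf8_payload = '{"name_of_user": "鳥鳥 Something"}'.encode("utf-8")
latin1_payload = '{"name_of_user": "café"}'.encode("latin-1")

print(decode_json_bytes(utf8_payload))
print(decode_json_bytes(latin1_payload))
```

With `requests`, the raw bytes would come from `r.content`. Decoding UTF-8 bytes as Latin-1 never raises an error, but it turns each Chinese character into several unrelated Latin characters, which may be why 'latin-1' looked like it worked until now.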


**INPUT**

```
import requests
from datetime import date, datetime
from time import sleep
import re
# and import some other local libraries


mydata = {}
jsoned = 1
page = 0
while jsoned:
    r = requests.get('http://thewebsite-i-scrape-for-data.com/api.php?request=all&page=' + str(page), timeout=60)
    # !!! 1st just print out the error -> 2nd then handle the error
    try:
        page += 1
        r.encoding = 'latin-1'
        jsoned = r.json()
        mydata = {**mydata,**jsoned}
        sleep(60)
        #pass # decide how to handle a server that's misbehaving to this extent
    except:
        print(page, r.json())
        pass
```
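
For comparison, here is a hedged sketch of the same paging loop that catches only decoding and parsing errors and prints the raw bytes instead of calling `r.json()` a second time inside the handler. `collect_pages` and the `fetch` callable are hypothetical names; `fetch` stands in for the real HTTP request so the sketch runs offline. It assumes every page is a JSON object encoded as UTF-8.

```python
import json
from typing import Callable, Dict


def collect_pages(fetch: Callable[[int], bytes], max_pages: int = 1000) -> Dict[str, dict]:
    """Merge JSON objects from successive pages until one is empty or unparsable."""
    merged: Dict[str, dict] = {}
    for page in range(max_pages):
        raw = fetch(page)
        try:
            data = json.loads(raw.decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError) as exc:
            # Show the raw bytes rather than parsing again inside the handler.
            print(f"page {page}: {exc!r}; body starts with {raw[:80]!r}")
            break
        if not data:
            # An empty object is treated as the end of the data.
            break
        merged.update(data)
    return merged


# Made-up pages standing in for the real API responses.
fake_pages = {
    0: '{"1": {"name_of_user": "鳥鳥"}}'.encode("utf-8"),
    1: b'{"2": {"name_of_user": "x"}}',
    2: b"",
}
result = collect_pages(lambda page: fake_pages.get(page, b""))
print(result)
```

In the real script, `fetch` would wrap `requests.get(...).content` and sleep between pages as the original loop does.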


**PRINTOUTS WITH THE JSON**

```
...
'1231234': {'id': 1231234,
  'name_of_user': '鳥鳥 Something Else With Latin Characters here just after the Chinese name',
  'a': 'Some Latin Text',
  'b': 'Some Latin Text',
  'c': '',
  'd': 523,
  'e': 325,
  'f': 0,
  'g': '500,000 .. 1,000,000',
  'h': 0,
  'i': 0,
  'j': 0,
  'k': 0,
  'l': '11',
  'm': '11',
  'n': '0',
  'o': 12},
...
```



**ERROR WITHOUT THE TRY/EXCEPT/PASS**

```
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
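
This message says the parser found no JSON value at the very first character, which also happens when the body is empty or is not JSON at all (an HTML error page, for example). The sketch below is one way to check what the body starts with. `classify_body` is a hypothetical helper, and its categories are my own assumption about what is worth distinguishing.

```python
import json


def classify_body(raw: bytes) -> str:
    """Give a rough label for what a response body starts with."""
    stripped = raw.lstrip()
    if not stripped:
        return "empty"
    if stripped.startswith(b"\xef\xbb\xbf"):
        return "utf-8 BOM"
    if stripped[:1] == b"<":
        return "markup"
    if stripped[:1] in (b"{", b"["):
        return "json-like"
    return "other"


# An empty body reproduces the same message with the standard json module.
try:
    json.loads("")
except json.JSONDecodeError as exc:
    empty_body_message = str(exc)

print(classify_body(b""), "->", empty_body_message)
print(classify_body(b"<html><body>Too many requests</body></html>"))
```

Calling it on `r.content` for the failing page would show whether the problem is the encoding or the body itself.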

**ERROR WITH UTF-8**

```
---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
<ipython-input-12-5ca2ab47bc7a> in <module>
     21         r.encoding = 'utf-8' #'latin-1'
---> 22         jsoned = r.json()
     23         mydata = {**mydata,**jsoned}

/usr/lib/python3/dist-packages/requests/models.py in json(self, **kwargs)
    896                     pass
--> 897         return complexjson.loads(self.text, **kwargs)
    898 

/usr/lib/python3/dist-packages/simplejson/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, use_decimal, **kw)
    517             and not use_decimal and not kw):
--> 518         return _default_decoder.decode(s)
    519     if cls is None:

/usr/lib/python3/dist-packages/simplejson/decoder.py in decode(self, s, _w, _PY3)
    369             s = str(s, self.encoding)
--> 370         obj, end = self.raw_decode(s)
    371         end = _w(s, end).end()

/usr/lib/python3/dist-packages/simplejson/decoder.py in raw_decode(self, s, idx, _w, _PY3)
    399                 idx += 3
--> 400         return self.scan_once(s, idx=_w(s, idx).end())

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

JSONDecodeError                           Traceback (most recent call last)
<ipython-input-12-5ca2ab47bc7a> in <module>
     25         #pass # decide how to handle a server that's misbehaving to this extent
     26     except: # (JSONDecodeError, RuntimeError, TypeError, NameError):
---> 27         print(page, r.json())
     28         pass
     29 

/usr/lib/python3/dist-packages/requests/models.py in json(self, **kwargs)
    895                     # used.
    896                     pass
--> 897         return complexjson.loads(self.text, **kwargs)
    898 
    899     @property

/usr/lib/python3/dist-packages/simplejson/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, use_decimal, **kw)
    516             parse_constant is None and object_pairs_hook is None
    517             and not use_decimal and not kw):
--> 518         return _default_decoder.decode(s)
    519     if cls is None:
    520         cls = JSONDecoder

/usr/lib/python3/dist-packages/simplejson/decoder.py in decode(self, s, _w, _PY3)
    368         if _PY3 and isinstance(s, bytes):
    369             s = str(s, self.encoding)
--> 370         obj, end = self.raw_decode(s)
    371         end = _w(s, end).end()
    372         if end != len(s):

/usr/lib/python3/dist-packages/simplejson/decoder.py in raw_decode(self, s, idx, _w, _PY3)
    398             elif ord0 == 0xef and s[idx:idx + 3] == '\xef\xbb\xbf':
    399                 idx += 3
--> 400         return self.scan_once(s, idx=_w(s, idx).end())

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```



**TRIED SOLUTIONS**

- https://stackoverflow.com/questions/63831916/scrapy-python-web-scraping-json
- https://stackoverflow.com/questions/33294213/how-to-decode-unicode-in-a-chinese-text
- https://stackoverflow.com/questions/16573332/jsondecodeerror-expecting-value-line-1-column-1-char-0
- https://github.com/psf/requests/issues/4908
- https://stackoverflow.com/questions/34930301/character-encoding-from-chinese-to-latin1-in-python
- https://pypi.org/project/jieba/
- https://stackoverflow.com/questions/34587346/python-check-if-a-string-contains-chinese-character/34587637
- https://stackoverflow.com/questions/30817137/when-python-loads-json-how-to-convert-str-to-unicode-so-i-can-print-chinese-cha
- https://stackoverflow.com/questions/18337407/saving-utf-8-texts-with-json-dumps-as-utf8-not-as-u-escape-sequence
- https://stackoverflow.com/questions/44203397/python-requests-get-returns-improperly-decoded-text-instead-of-utf-8
- https://stackoverflow.com/questions/49702214/python-requests-response-encoded-in-utf-8-but-cannot-be-decoded
- https://stackoverflow.com/questions/45111047/encoding-error-in-python-api-response
- https://stackoverflow.com/questions/12309269/how-do-i-write-json-data-to-a-file
