Last Updated: February 25, 2016
· Gabriel Falcão

Raising UnicodeEncodeError and UnicodeDecodeError manually (for testing purposes)

When writing Python unit tests, it can be useful to synthesize the exceptions UnicodeEncodeError and UnicodeDecodeError.

However, these exception classes take several required constructor arguments that are not easy to find in the Python documentation (other than the C-API documentation).

Handy documentation for UnicodeDecodeError

arguments:

  • encoding name (string, e.g. "utf-8"): can be anything; it does not have to name a real codec. On Python 3 it must be a str, not bytes.
  • subject object (bytestring, e.g. b""): can be empty; its content makes no difference.
  • start (int): the index of the first supposedly unrecognized byte.
  • end (int): the index just after the last supposedly unrecognized byte.
  • exception message (string, e.g. "oops it's buggy"): can be anything.

example usage:

UnicodeDecodeError('hitchhiker', b"", 42, 43, 'the universe and everything else')
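
As a minimal sketch of how such a synthesized exception might be used in a test, assuming Python 3: `read_text` and `failing_decoder` below are hypothetical names invented for illustration, and the argument values are arbitrary test data.

```python
def read_text(raw, decoder):
    """Hypothetical code under test: decode bytes, with a fallback on failure."""
    try:
        return decoder(raw)
    except UnicodeDecodeError as exc:
        # start and end hold the values that were passed to the constructor.
        return "<undecodable bytes {}..{}>".format(exc.start, exc.end)


def failing_decoder(raw):
    # Synthesize the exception by hand; every argument is arbitrary test data.
    raise UnicodeDecodeError("utf-8", raw, 3, 4, "synthetic failure")


result = read_text(b"caf\xe9", failing_decoder)
print(result)
```

The constructor arguments are available afterwards as the attributes encoding, object, start, end and reason, so the error-handling path can be checked without crafting input that really fails to decode.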

Handy documentation for UnicodeEncodeError

Very similar to UnicodeDecodeError, but the subject object must
be a unicode object (str on Python 3)

arguments:

  • encoding name (string, e.g. "utf-8"): can be anything; it does not have to name a real codec. On Python 3 it must be a str, not bytes.
  • subject object (unicode string, e.g. u""): can be empty; its content makes no difference.
  • start (int): the index of the first supposedly unrecognized character.
  • end (int): the index just after the last supposedly unrecognized character.
  • exception message (string, e.g. "oops it's buggy"): can be anything.

example usage:

UnicodeEncodeError('hitchhiker', u"", 42, 43, 'the universe and everything else')
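
One possible way to inject the synthesized exception into a test is through unittest.mock, sketched here under the assumption of Python 3. `safe_encode` is a hypothetical function standing in for the code under test, and the argument values are arbitrary.

```python
from unittest import mock


def safe_encode(text, encoder):
    """Hypothetical code under test: encode text and report failures."""
    try:
        return encoder(text)
    except UnicodeEncodeError as exc:
        # Slice the subject object with start/end to get the offending part.
        bad = exc.object[exc.start:exc.end]
        return "cannot encode {!r} with {}".format(bad, exc.encoding)


# A Mock whose side_effect is an exception instance raises it when called.
error = UnicodeEncodeError("ascii", u"caf\xe9", 3, 4, "synthetic failure")
fake_encoder = mock.Mock(side_effect=error)

message = safe_encode(u"caf\xe9", fake_encoder)
print(message)
```

Passing the exception through a mock keeps the test independent of any real codec behaviour; the same pattern should work for UnicodeDecodeError with a bytes subject object.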