Project Description
A C# class that determines whether a given file or stream is encoded in UTF-8.

Utf8Checker is a very lightweight managed checker intended to determine whether a file or stream is encoded in UTF-8.

Reference
http://www.unicode.org/versions/corrigendum1.html

The current implementation focuses on correctness rather than performance.
Therefore the whole provided file/stream is scanned and validated.
For scenarios where performance matters more, it can be modified to scan only a portion at the beginning, for example the first 256 bytes.
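
To illustrate the kind of scan described above, here is a minimal sketch. It is not the project's actual Utf8Checker code. The class name Utf8Sketch, the method IsUtf8 and the maxBytes parameter are hypothetical names chosen for this example. The byte ranges follow the well-formed UTF-8 table of the current Unicode Standard, which is slightly stricter than the referenced corrigendum because it also rejects encoded surrogates (ED A0..BF). When a byte limit is given, a sequence that has already started is still read to its end, so the limit never cuts a character in half.

```csharp
using System;
using System.IO;

// Hypothetical sketch; not the project's actual Utf8Checker class.
public static class Utf8Sketch
{
    // Returns true if the scanned part of the stream is well-formed UTF-8.
    // maxBytes < 0 scans the whole stream; otherwise scanning stops once
    // at least maxBytes bytes were read (a started sequence is always finished).
    public static bool IsUtf8(Stream stream, long maxBytes = -1)
    {
        long read = 0;
        while (maxBytes < 0 || read < maxBytes)
        {
            int b = stream.ReadByte();
            if (b < 0) return true;      // clean end of stream
            read++;
            if (b <= 0x7F) continue;     // single-byte sequence (ASCII)

            int trailing;                // continuation bytes still expected
            int lo = 0x80, hi = 0xBF;    // allowed range of the first continuation byte
            if (b >= 0xC2 && b <= 0xDF) trailing = 1;
            else if (b == 0xE0) { trailing = 2; lo = 0xA0; }  // no overlong 3-byte forms
            else if (b == 0xED) { trailing = 2; hi = 0x9F; }  // no surrogates
            else if (b >= 0xE1 && b <= 0xEF) trailing = 2;
            else if (b == 0xF0) { trailing = 3; lo = 0x90; }  // no overlong 4-byte forms
            else if (b >= 0xF1 && b <= 0xF3) trailing = 3;
            else if (b == 0xF4) { trailing = 3; hi = 0x8F; }  // nothing above U+10FFFF
            else return false;           // 80..C1 and F5..FF never start a sequence

            for (int i = 0; i < trailing; i++)
            {
                int c = stream.ReadByte();
                read++;
                if (c < lo || c > hi) return false;  // also catches end of stream (-1)
                lo = 0x80; hi = 0xBF;    // later continuation bytes use the full range
            }
        }
        return true;
    }
}
```

With this sketch, a full scan would be Utf8Sketch.IsUtf8(stream) and a quick scan of roughly the first 256 bytes would be Utf8Sketch.IsUtf8(stream, 256). A limited scan can only say that the scanned prefix is valid; invalid bytes later in the file go unnoticed.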

Tests are provided, including two comprehensive UTF-8 encoded test files.

Test data sources:
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt
http://www.columbia.edu/kermit/utf8.html
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt

Last edited Feb 6, 2010 at 4:14 PM by devdimi, version 6