Ambrose Li wrote:
Aude Sylvain wrote:
This may be a bit off-topic (sorry if so), but would anyone know why RTF files are so much larger than the same document saved in the .doc/.docx format?
The fundamental difference is that RTF files are essentially just text files with a lot of markup; the markup takes up a lot of space and nothing is compressed, so the files tend to be large. DOCX files, on the other hand, are really ZIP files with a standardized structure. (They are actual ZIP files that can be opened by unzip programs.) Their text component is still a text file with markup (XML markup instead of RTF markup), but text compresses well, so unless there’s a lot of binary data (e.g., images), DOCX files will usually be smaller.
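To make the compression point concrete, here is a minimal Python sketch. It is only an illustration: the XML fragment is a made-up stand-in for a document body, the part name word/document.xml is used just as a label, and the resulting archive is a plain ZIP, not a valid DOCX that Word would open.

```python
import io
import zipfile

# Hypothetical stand-in for the text part of a document:
# short runs of text wrapped in repetitive XML markup.
paragraph = '<w:p><w:r><w:rPr><w:b/></w:rPr><w:t>Hello, world.</w:t></w:r></w:p>'
xml_text = ('<w:document>' + paragraph * 2000 + '</w:document>').encode('utf-8')


def zip_bytes(name, data):
    """Return an in-memory ZIP archive holding one deflate-compressed member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, data)
    return buf.getvalue()


archive = zip_bytes('word/document.xml', xml_text)
raw_size = len(xml_text)
zipped_size = len(archive)

# The same markup takes far less room once it sits inside a ZIP container.
print('uncompressed markup:', raw_size, 'bytes')
print('inside a ZIP archive:', zipped_size, 'bytes')
```

Real documents are less repetitive than this toy input, so the ratio will be less dramatic, but the direction is the same. Passing a real .docx file to zipfile.is_zipfile() should also return True.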
(Even when there’s a lot of binary data, binary data in a DOCX file is stored in actual binary form, so it still takes up less space than its counterpart in an RTF file, where it is typically written out as hexadecimal text.)
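The cost of keeping binary data in a text-only format can be sketched the same way. This assumes the common case where an RTF writer stores an embedded picture as hexadecimal digits, two characters per byte; the random bytes below are a hypothetical stand-in for image data, not a real picture.

```python
import binascii
import os

# Hypothetical stand-in for an embedded image (already-compressed data
# such as JPEG or PNG looks roughly like random bytes).
image_bytes = os.urandom(10_000)

# Hex encoding writes every byte as two ASCII characters.
hex_text = binascii.hexlify(image_bytes)

print('stored as binary:  ', len(image_bytes), 'bytes')
print('stored as hex text:', len(hex_text), 'bytes')
```

Under that assumption the hex form is exactly twice the size of the binary form, before any other markup is counted.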
DOC files are binary files, and simply by virtue of being binary they tend to be smaller than RTF files, which are text-based.
Thank you Ambrose!
[Edited at 2011-02-24 20:17 GMT]