Data Compression: Searching for Patterns
A compression program has no concept of separate words: it only
looks for patterns.
Pattern: a combination of characters that is repeated in the sentence.
Patterns range from simple ones ("ou" in "your" and "country")
to ones spanning more than one word ("can do for you").
The ability to rewrite the dictionary is the "adaptive" part of
the LZ adaptive dictionary-based algorithm.
The way a program actually does this is fairly complicated,
as you can see by the discussions on Data-Compression.com.
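To make the "adaptive" idea concrete, here is a minimal sketch of an LZ78-style coder, one member of the LZ family. The notes do not say which LZ variant is meant, so the choice of LZ78 and the names lz78_compress and lz78_decompress are assumptions for illustration only. The point to notice is that the dictionary starts empty and is extended while the input is read.

```python
def lz78_compress(text):
    """Return (index, char) pairs; index 0 means 'no known prefix'."""
    dictionary = {}   # phrase -> index, extended while reading the input
    output = []
    phrase = ""
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            phrase = candidate            # keep extending a known pattern
        else:
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[candidate] = len(dictionary) + 1   # learn a new pattern
            phrase = ""
    if phrase:
        # The input ended in the middle of an already known phrase.
        output.append((dictionary[phrase], ""))
    return output


def lz78_decompress(pairs):
    """Rebuild the text, re-creating the same dictionary along the way."""
    phrases = [""]    # index 0 is the empty prefix
    out = []
    for index, ch in pairs:
        entry = phrases[index] + ch
        out.append(entry)
        phrases.append(entry)
    return "".join(out)
```

The decoder never receives the dictionary: it rebuilds it from the pairs, which is why an adaptive scheme does not need to store the dictionary alongside the data.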
No matter what specific method you use, this in-depth searching system lets
you compress the file much more efficiently than you could by just picking out
words.
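As a rough illustration of searching for patterns rather than words, the sketch below counts every substring within a length range and keeps those that occur more than once. The function name repeated_patterns, the length limits, and the sample sentence written with "_" for spaces are assumptions made for this example; a real compressor uses a far more efficient search.

```python
from collections import Counter


def repeated_patterns(text, min_len=2, max_len=15):
    """Map each substring of min_len..max_len characters that occurs
    more than once to its number of occurrences (overlaps included)."""
    counts = Counter(
        text[i:i + n]
        for n in range(min_len, max_len + 1)
        for i in range(len(text) - n + 1)
    )
    return {pattern: count for pattern, count in counts.items() if count > 1}


sentence = ("ask_not_what_your_country_can_do_for_you;"
            "_ask_what_you_can_do_for_your_country")
patterns = repeated_patterns(sentence)
print(patterns["ou"])                # short pattern inside words
print(patterns["_can_do_for_you"])   # pattern spanning several words
```

Note that the search treats "_" like any other character, so a pattern is free to start or end in the middle of a word.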
Using the patterns we picked out above, and adding "_" for spaces,
we come up with this larger dictionary:
1. ask_
2. what_
3. you
4. r_country
5. _can_do_for_you
With each digit standing for the dictionary entry at that position, the sentence becomes:
"1not_2345;_12354"
Sentence 16 units + dictionary 40 units = 56 units!
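The encoding can be checked by expanding it again. This is a small sketch, assuming each digit 1 to 5 refers to a dictionary entry by position and that the text itself contains no literal digits; the names DICTIONARY, ENCODED and expand are made up for the example.

```python
DICTIONARY = ["ask_", "what_", "you", "r_country", "_can_do_for_you"]
ENCODED = "1not_2345;_12354"


def expand(encoded, dictionary):
    """Replace every digit with the dictionary entry it refers to
    (1-based); all other characters are copied through unchanged."""
    return "".join(
        dictionary[int(ch) - 1] if ch.isdigit() else ch
        for ch in encoded
    )


original = expand(ENCODED, DICTIONARY)
print(original)
# ask_not_what_your_country_can_do_for_you;_ask_what_you_can_do_for_your_country
print(len(original))   # 78 units before compression
print(len(ENCODED))    # 16 units for the encoded sentence
```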
Source: http://computer.howstuffworks.com/file-compression1.htm