Puzzling file size

winwintoo 10-03-2003, 06:45 PM
I'm working on my presentation to the computer group and came across some interesting information.
I want to give them an overview of file size, resolution etc.
BUT - just when I thought I understood it........
I have this image. It's saved for web and is 48 KB on my machine. I put it on a web site, went to the web site, right-clicked and copied it to the clipboard, and then immediately pasted it into Photoshop Elements. Photoshop Elements says the image is now 287 KB.
What is happening?? Can anyone explain??
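One way to sanity-check the two numbers is to work out how many bytes the image needs once every pixel is stored uncompressed, which is roughly what an editor holds in memory after a paste. The sketch below is only an illustration: the thread never gives the image dimensions, so the 350 x 280 size and the `decoded_size_bytes` helper are assumptions chosen to show the arithmetic.

```python
def decoded_size_bytes(width, height, channels=3, bits_per_channel=8):
    """Bytes needed to hold every pixel uncompressed (no JPEG compression)."""
    return width * height * channels * bits_per_channel // 8


# Hypothetical dimensions: the thread does not state the real ones.
width, height = 350, 280

raw = decoded_size_bytes(width, height)   # 8-bit RGB, 3 bytes per pixel
jpeg_on_disk = 48 * 1024                  # the 48 KB file from the post

print(f"{raw} bytes = {raw / 1024:.0f} KB uncompressed")
print(f"about {raw / jpeg_on_disk:.1f}x the size of the JPEG on disk")
```

With these assumed dimensions the uncompressed figure comes out near the 287 KB that Photoshop Elements reported, even though the file on disk is only 48 KB.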
Thanks,
Margaret

Doug Nelson 10-03-2003, 08:05 PM
Photoshop files are a lot larger than JPG files, not even counting the compression offered by JPG. JPG saves space by more than just compression. It also strips out a lot of other information. Plus, Photoshop seems to always inflate the file size of open files, since the figure it reports is the uncompressed image in memory rather than the compressed file on disk.

catia 10-03-2003, 08:32 PM
Perhaps the same size gremlins that take a 200K file and "convert" it to 287K when I attach it to a Yahoo email. Then we don't even want to talk about AOL.
:)
Catia

Doug Nelson 10-03-2003, 08:38 PM
Actually, it's very similar. JPG files save size by storing the command "8 black pixels" instead of storing 8 actual black pixels (as an example; the actual mechanism is much more sophisticated than this). Plus, they apply entropy coding after that, typically Huffman coding (sort of like a ZIP file, but a different algorithm). Uncompressing these files entails undoing the entropy coding, then actually painting in each pixel. At this point it would be similar to a TIF file with no compression, or a BMP file (of which the AOL ART file is a version). Similar, but not interchangeable, since (again) this is a gross oversimplification. Photoshop also adds its own information, plus there's room for other information such as EXIF data, etc.

catia 10-03-2003, 08:56 PM
Thanks Doug,
That was very informative. I guess a pixel is not a pixel is not a pixel. :(
Or something like that. :D
Catia

Doug Nelson 10-03-2003, 09:15 PM
While I'm on a roll I might as well go on and explain why it's not good to save a JPG file as another JPG file. As I mentioned, JPG will look at repeating patterns as a source of compression, but you can control how intently it looks for patterns using the "quality" or "amount" setting. At high compression levels it will see, for example, 8 black pixels, then 1 white pixel, then another 8 black pixels and say essentially "screw it, that 1 white pixel isn't that important" and record the data as just 17 black pixels. Then the next time it will record those 17 black pixels as a single command and look around for other black pixels and decide how important the information in between them is. This is called "lossy compression", and naturally it does it for every color, not just black. So, just hitting "save" on a JPG file can actually destroy data, as it does this analysis each time the file is saved. Gradually it will toss more and more data.
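As a rough sketch of why discarded data cannot come back, the snippet below rounds a list of made-up numbers to a coarser step, which is the general idea behind a lossy "quality" setting. The values, the step sizes, and the `quantize` helper are all assumptions for illustration. A real JPEG encoder applies this kind of rounding to frequency coefficients using per-frequency tables, not to a flat list like this one.

```python
def quantize(coeffs, step):
    """Round every value to the nearest multiple of `step` (the lossy part)."""
    return [round(c / step) * step for c in coeffs]


# Made-up values: large numbers stand for coarse shape, small ones for fine detail.
original = [620, -38, 11, -4, 2, 1]

first_save = quantize(original, 16)
print(first_save)                    # [624, -32, 16, 0, 0, 0]

# Saving again at a different setting shifts the surviving values once more.
second_save = quantize(first_save, 24)
print(second_save)                   # [624, -24, 24, 0, 0, 0]

# In this toy model only: repeating the original step changes nothing further.
print(quantize(first_save, 16) == first_save)   # True
```

The small values round to zero on the first pass, and nothing in the saved numbers records what they used to be.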
On an interesting side note (I love throwing tidbits like this in), if you happen to know the precise compression setting the JPG had when it was originally saved, and replicate that exact setting, virtually no data will be lost when re-saving it as a JPG. That's more in the realm of interesting trivia than actual workflow advice, however.

winwintoo 10-03-2003, 09:36 PM
Doug, this presentation is on October 15 from 1 - 3:30 pm at the Seniors Education Center just down the street. If you can be there to explain file sizes, I'll buy you lunch :D :D :D
Thank you so much for the information. I never even thought that interpreting the "8 black pixels" line into actual black pixels would make the size different.
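The "8 black pixels" idea is plain run-length encoding, and a minimal sketch of it shows why the decoded picture is bigger than the stored commands. This is the simplified picture from the thread, not the actual JPEG algorithm, and the `rle_encode` and `rle_decode` names are hypothetical helpers written for this example.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse each run of identical pixels into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(runs):
    """Expand (value, count) pairs back into individual pixels."""
    return [value for value, count in runs for _ in range(count)]


row = ["black"] * 8 + ["white"] + ["black"] * 8

runs = rle_encode(row)
print(runs)   # [('black', 8), ('white', 1), ('black', 8)]
print(len(runs), "commands expand to", len(rle_decode(runs)), "pixels")
```

Three stored commands turn back into 17 pixels once they are "painted in", which is the size difference in miniature.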
Margaret

catia 10-03-2003, 09:41 PM
Hmmm,
I thought JPEG was based on 8 pixel by 8 pixel block encoding. Each 8x8 block is cosine transformed (this is a linear transformation that packs all of the energy into the low spatial frequency terms). Then bits are assigned to the terms with the most energy. So, high spatial frequency stuff gets a small number of bits. To be able to reconstruct the image, one needs the "bitmap." The cosine coefficients and the bit map are "packed" into the file. Now, that was the "source" coding part. The fun begins with the "channel" coding. We have to add back in enough redundancy to ensure that we can recover the bitmap and the cosine transform coefficients. That is where the variability arises. By the way, the DC term of each transform block (the zero spatial frequency term) is basically the average intensity of that block. Any error in this term produces the dreaded blocking error.
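The transform step can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: it uses an orthonormal DCT-II, a made-up smooth 8x8 block, and hypothetical helper names (`dct_1d`, `dct_2d`), and it covers only the cosine transform, not bit allocation or any later coding stage.

```python
import math

N = 8  # JPEG-style block size


def dct_1d(v):
    """Orthonormal DCT-II of a length-N sequence."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        total = sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                    for n in range(N))
        out.append(scale * total)
    return out


def dct_2d(block):
    """Apply the 1-D transform along every row, then along every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[y][u] for y in range(N)]) for u in range(N)]
    return [[cols[u][v] for u in range(N)] for v in range(N)]  # coeffs[v][u]


# Made-up smooth block: a gentle brightness ramp, like a patch of sky.
block = [[100 + 4 * x + 2 * y for x in range(N)] for y in range(N)]
coeffs = dct_2d(block)

mean = sum(map(sum, block)) / (N * N)
total_energy = sum(c * c for row in coeffs for c in row)
low_energy = sum(coeffs[v][u] ** 2 for v in range(2) for u in range(2))

print(f"block mean = {mean}, DC term / 8 = {coeffs[0][0] / 8:.3f}")
print(f"share of energy in the 2x2 low-frequency corner: {low_energy / total_energy:.4f}")
```

With this scaling the DC term is 8 times the block average, and for a smooth block nearly all of the energy sits in the lowest-frequency terms, which is why the high-frequency terms can get few bits.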
Hmmmmm. Sorry, I guess I am getting carried away. Information theory was my minor and every once in a while it slips out. :D :D
Catia

Doug Nelson 10-03-2003, 09:59 PM
Like I said, "gross oversimplification" :)

catia 10-03-2003, 10:07 PM
You are a sweetheart Doug and we all love you. We really do. Keep up the good work.
:)
Catia