Hi, Hans.

My complicated C-programmer mind says to take each PDF, process it in binary form and spit out a new PDF with all timestamp digits changed to zeroes, then use the postprocessed files for comparison. That’s probably not a practical approach as it stands, but it might spark an idea that could work.
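
Here is a rough sketch of that idea, in Python rather than C to keep it short. It assumes the timestamps are standard PDF date strings of the form D:YYYYMMDDHHmmSS with an optional timezone, and that they sit in uncompressed parts of the file. Anything inside a compressed stream, and XMP-style ISO dates, would not be touched. The names zero_timestamps and same_after_masking are made up for illustration.

```python
import re

# PDF date strings look like D:YYYYMMDDHHmmSS, optionally followed by a
# timezone such as -07'00'. Assumption: the dates are stored uncompressed.
PDF_DATE = re.compile(rb"D:\d{4,14}(?:[+\-Z](?:\d{2}'?){0,2})?")


def zero_timestamps(data: bytes) -> bytes:
    """Return data with every digit inside a PDF date string set to '0'."""
    def blank(match):
        # Same-length replacement, so byte offsets in the file do not move.
        return re.sub(rb"\d", b"0", match.group(0))
    return PDF_DATE.sub(blank, data)


def same_after_masking(a: bytes, b: bytes) -> bool:
    """Compare two PDFs (as bytes) while ignoring their date strings."""
    return zero_timestamps(a) == zero_timestamps(b)
```

To try it, read web1.pdf and web2.pdf with open(name, "rb").read() and pass both byte strings to same_after_masking. Hostnames or other dynamic content would need their own patterns.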

Victor
_________

On Feb 1, 2017, at 10:31:10, der.hans <PLUGd@LuftHans.com> wrote:

On 01 Feb 2017, Joseph Sinclair said:

moin moin,

Have you checked DiffPDF?
It's supposed to do what you're looking for, although it's no longer actively maintained (the author took it closed-source :( ).

Not familiar with it. Ah, comparepdf for the command line version.

comparepdf -ca web1.pdf web2.pdf

Will have to test with it to verify some basic tolerances.

Another alternative might be to use pdf2ps (part of ghostscript) to
transform into postscript and compare that, but you may need to do
more massaging as timestamps and such would probably still be in the
postscript.

I tried that. I expected the datestamps to carry over, but they appear
not to have. The files are still different, though :(.

I was thinking pdf2png or pdf2jpg type of thing might work. Have to check
if I always get the same output.

ciao,

der.hans

On 02/01/2017 01:29 AM, der.hans wrote:
moin moin,

I have some dynamically generated PDFs coming from a pool of web servers.

Each server should be generating a PDF that looks exactly the same as from
all the other servers.

The PDF generation includes sticking in a few timestamps and possibly some
hostnames or other dynamic content. The dynamic content eliminates the
option of just using checksums to verify the output file is the same from
all of the web servers.

Any suggestions on how I can write a command line check? Needing to
install a script would be far less than ideal in this situation. Funnily
enough, needing to install a package would be less of an issue in this
particular case, especially something in CentOS 6.

Me being me, I did try to just grep out the lines with timestamps :). That
didn't quite work :(. That probably indicates the files aren't as close to
identical as I had hoped.

I didn't see a pdf2sanity tool. pdf2text won't really work as I need to
verify the graphic content and hopefully the PDF wrapper.

ciao,

der.hans



--
#  http://www.LuftHans.com/        http://www.PhxLinux.org/
#  "Wasted day. Wasted life. Dessert, please."  -- Steven Meretzky
---------------------------------------------------
PLUG-discuss mailing list - PLUG-discuss@lists.phxlinux.org
To subscribe, unsubscribe, or to change your mail settings:
http://lists.phxlinux.org/mailman/listinfo/plug-discuss