A more complete extraction of image metadata than the mini-code in "Nyhetsanalys: Sunt förnuft när det gäller bildanalysen" (News analysis: common sense in image analysis). Possible collisions between variables deeper in the tree are not handled. One still needs to be aware of them, and any collisions can be dealt with when they arise. There is some redundancy between the three modules used, so for throwaway indexing on limited hardware there is room to optimize.
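One way to stay aware of collisions is to record which module supplied each key. The following is a minimal sketch under assumptions: the helper record_field, the hash %seen_sources and the source labels are hypothetical and not part of the original script, and the third call uses a made-up value.

```perl
use strict;
use warnings;

my %metadata_image;   # key -> value -> count, same shape as in the main script
my %seen_sources;     # hypothetical: key -> source label -> 1

# Hypothetical helper: store a value and remember which module supplied it.
sub record_field {
    my ($source, $key, $value) = @_;
    return 0 unless defined $value && length $value;
    $metadata_image{$key}{$value}++;
    $seen_sources{$key}{$source} = 1;
    return 1;
}

# Keys written by more than one source are candidate collisions, or simply
# redundant fields that could be skipped on limited hardware.
sub colliding_keys {
    my @hits = grep { scalar( keys %{ $seen_sources{$_} } ) > 1 }
               keys %seen_sources;
    return sort @hits;
}

record_field('iptc',       'source', 'X90072');
record_field('image_info', 'width',  380);
record_field('exiftool',   'source', 'some other meaning');   # made-up value

print join(", ", colliding_keys()), "\n";   # prints "source"
```

Whether a flagged key is a real conflict or just the same fact reported twice still has to be judged per field.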
For the interested reader, the example printout for an image from Reuters also gives a complementary, related but simpler example of the method discussed briefly in "Snowden-filerna: Att detektera manipulerad information" (The Snowden files: detecting manipulated information). One example is personal preferences or organizational bias, where certain programs, image sizes and so on are more or less normal, either in themselves or given the value of some other metadata field.
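To make "normal given the value of other metadata" concrete, one can count how often a field value occurs among images that share another field value. This is a minimal sketch with made-up records: the field names are taken from the printout, while the helper conditional_share, the value 'SomeEditor' and the threshold are assumptions for illustration. It is not the method from the referenced post.

```perl
use strict;
use warnings;

# Made-up records for illustration; real input would be collected per image.
my @records = (
    { credit => 'REUTERS', 'originating program' => 'JPEGTOII2/MED' },
    { credit => 'REUTERS', 'originating program' => 'JPEGTOII2/MED' },
    { credit => 'REUTERS', 'originating program' => 'JPEGTOII2/MED' },
    { credit => 'REUTERS', 'originating program' => 'SomeEditor' },
);

# Share of records with $field eq $value among those where $given eq $given_value.
sub conditional_share {
    my ($records, $field, $value, $given, $given_value) = @_;
    my ($match, $total) = (0, 0);
    for my $r (@$records) {
        next unless defined $r->{$given} && $r->{$given} eq $given_value;
        $total++;
        $match++ if defined $r->{$field} && $r->{$field} eq $value;
    }
    return $total ? $match / $total : 0;
}

my $share = conditional_share(\@records, 'originating program', 'SomeEditor',
                              'credit', 'REUTERS');
printf "share = %.2f\n", $share;                       # 1 of 4 records
print "unusual for this credit\n" if $share < 0.5;     # arbitrary threshold
```

A low share only marks a value as worth a closer look; it says nothing by itself about manipulation.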
Example: Metadata printout
For many (but not all) fields, documentation comes with the ready-made modules used, which are available on search.cpan.org. Other metadata that may occur is assorted legacy, more or less faithful to how it was intended to be used, and sometimes with its own formatting in the data fields.
Bits Per Sample	8
Color Components	3
Comment	CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 95
Current IPTC Digest	bf21543e5c98c2174bac65abbe29c7ca
Directory	test
Encoding Process	Baseline DCT, Huffman coding
ExifTool Version Number	9.27
File Access Date/Time	2013:11:27 10:32:53+01:00
File Creation Date/Time	2013:11:27 10:33:31+01:00
File Modification Date/Time	2013:11:27 10:33:45+01:00
File Name	a1.jpg
File Permissions	rw-rw-rw-
File Size	41 kB
File Type	JPEG
Image Height	215
Image Size	380x215
Image Width	380
JFIF Version	1.01
MIME Type	image/jpeg
Resolution Unit	None
SamplesPerPixel	3
X Resolution	1
Y Cb Cr Sub Sampling	YCbCr4:2:0 (2 2)
Y Resolution	1
by-line	DENIS BALIBOUSE
caption/abstract	European Union foreign policy chief Catherine Ashton (3rd L) delivers a statement during a ceremony next to British Foreign Secretary William Hague, Germany's Foreign Minister Guido Westerwelle, Iranian Foreign Minister Mohammad Javad Zarif, Chinese Foreign Minister Wang Yi, U.S. Secretary of State John Kerry, Russia's Foreign Minister Sergei Lavrov and French Foreign Minister Laurent Fabius (L-R) at the United Nations in Geneva November 24, 2013. Iran and six world powers reached a breakthrough agreement early on Sunday to curb Tehran's nuclear programme in exchange for limited sanctions relief, in a first step towards resolving a dangerous decade-old standoff. REUTERS/Denis Balibouse (SWITZERLAND - Tags: POLITICS ENERGY TPX IMAGES OF THE DAY)
category	I
city	GENEVA
color_type	YCbCr
country/primary location code	CHE
country/primary location name	Switzerland
credit	REUTERS
date created	20131124
edit status	CORRECTION
file_ext	jpg
file_media_type	image/jpeg
fixture identifier	GM1E9BO0W9W01
headline	European Union foreign policy chief Catherine Ashton delivers a statement during a ceremony at the United Nations in Geneva
height	215
image type	3S
keywords	:rel:d:bm:GF2E9BO09X801
language identifier	en
object name	IRAN-NUCLEAR-DEAL/
original transmission reference	DBA01
originating program	JPEGTOII2/MED
program version	1.0.0.16
source	X90072
supplemental category	DIP POL ENR tpx
time created	053640+0000
urgency	4
width	380
writer/editor	DBA/KR
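The "date created" and "time created" fields in the printout illustrate field-specific formatting: the values look like CCYYMMDD and HHMMSS followed by a signed HHMM offset. The following is a minimal sketch of normalizing such a pair to an ISO 8601 string, assuming that layout holds. The helper name iptc_to_iso8601 is hypothetical, and legacy files may well deviate from the layout, in which case the helper returns undef.

```perl
use strict;
use warnings;

# Hypothetical helper: combine "date created" (CCYYMMDD) and "time created"
# (HHMMSS+HHMM or HHMMSS-HHMM) into one ISO 8601 string.
# Returns undef when either field does not match the assumed layout.
sub iptc_to_iso8601 {
    my ($date, $time) = @_;
    return undef unless defined $date && defined $time;
    my ($y, $m, $d) = $date =~ /^(\d{4})(\d{2})(\d{2})$/
        or return undef;
    my ($hh, $mi, $ss, $sign, $oh, $om) =
        $time =~ /^(\d{2})(\d{2})(\d{2})([+-])(\d{2})(\d{2})$/
        or return undef;
    return sprintf "%s-%s-%sT%s:%s:%s%s%s:%s",
        $y, $m, $d, $hh, $mi, $ss, $sign, $oh, $om;
}

# Values taken from the printout above.
print iptc_to_iso8601('20131124', '053640+0000'), "\n";
# prints "2013-11-24T05:36:40+00:00"
```

The sketch checks only the shape of the fields, not whether the date or time is valid.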
Code
Perl.
use FileHandle;
use Image::Info qw(image_info dim);
use Image::EXIF;
use Image::ExifTool qw(:Public);
use Image::IPTCInfo;

my $debug = 1;
my %metadata_image;   # key -> value -> count, filled by all three extractors

&run_it("_RULE_bRITANNIA", "test/" . "a1.jpg");

sub run_it() {
    my $session_id = $_[0];
    my $file = $_[1];
    if ( length($session_id) < 3 ) { die; }
    # Check that the file can be opened before handing it to the modules.
    my $fp = FileHandle -> new($file);
    if ( ! $fp ) { die; }
    $fp -> close();
    &sense__image__init_session($session_id);
    #.................................
    &sense__image__iptc($session_id,$file);
    &sense__image__elif_tags($session_id,$file);
    &sense__image__image_info($session_id,$file);
    if ( $debug ) { &power_print(); }
    #.................................
    &sense__image__end_session();
}

sub sense__image__init_session() {
    undef %metadata_image;
    return $_[0];
}

sub sense__image__end_session() {
    undef %metadata_image;
    return 1;
}

sub sense__image__iptc() {
    # Legacy in the data field values.
    my $file_name = $_[1];
    my $info = new Image::IPTCInfo($file_name);
    # new() yields undef when no IPTC data is found or an error occurs.
    if ( ! $info ) { return 0; }
    my %db = %{$info};
    if ( ! %db ) { return 0; }
    my @keys = keys %db;
    my $i = 0;
    my $dirty = 0;
    while ( $i < @keys ) {
        # Only the hash-valued members of the object hold field data.
        if ( ! ( ref ( $db{$keys[$i]} ) eq "HASH" ) ) { goto abc; }
        my @gg = keys %{$db{$keys[$i]}};
        my $k = 0;
        while ( $k < @gg ) {
            # A field is either a list of values or a single value.
            my @ww;
            if ( ref ( $db{$keys[$i]} -> {$gg[$k]} ) eq "ARRAY" ) {
                @ww = @{$db{$keys[$i]} -> {$gg[$k]}};
            }
            else {
                $ww[0] = $db{$keys[$i]} -> {$gg[$k]};
            }
            my $cc = 0;
            while ( $cc < @ww ) {
                my $value = $ww[$cc];
                if ( length($value) > 0 ) {
                    # If metadata collides: use a stop field or handle it some other way :-D
                    $metadata_image{$gg[$k]} -> {$value}++;
                    $dirty = 1;
                }
                $cc++;
            }
            $k++;
        }
        abc:
        $i++;
    }
    return $dirty;
}

sub sense__image__image_info() {
    my $file_name = $_[1];
    my %info = %{image_info($file_name)};
    my @keys = keys %info;
    my $i = 0;
    my $dirty = 0;
    while ( $i < @keys ) {
        # Keep only a fixed set of fields from Image::Info.
        if ( ( $keys[$i] eq "color_type" ) || ( $keys[$i] eq "file_media_type" )
          || ( $keys[$i] eq "file_ext" ) || ( $keys[$i] eq "width" )
          || ( $keys[$i] eq "height" ) || ( $keys[$i] eq "SamplesPerPixel" )
          || ( $keys[$i] eq "Interlace" ) || ( $keys[$i] eq "Compression" )
          || ( $keys[$i] eq "Gamma" ) || ( $keys[$i] eq "LastModificationTime" ) ) {
            if ( length($info{$keys[$i]}) > 0 ) {
                $metadata_image{$keys[$i]} -> {$info{$keys[$i]}}++;
                $dirty = 1;
            }
        }
        $i++;
    }
    return $dirty;
}

sub sense__image__elif_tags() {
    my $file_name = $_[1];
    # Reuses the example from the module documentation, roughly...
    my $exifTool = new Image::ExifTool;
    $exifTool->Options(Unknown => 1);
    my $info = $exifTool->ImageInfo($file_name);
    my $group = '';
    my $tag = '';
    my $c1 = 0;
    my $dirty = 0;
    foreach $tag ($exifTool->GetFoundTags('Group0')) {
        if ($group ne $exifTool->GetGroup($tag)) {
            $group = $exifTool->GetGroup($tag);
        }
        my $val = $info->{$tag};
        if (ref $val eq 'SCALAR') {
            if ($$val =~ /^Binary data/) {
                $val = "($$val)";
            }
            else {
                my $len = length($$val);
                $val = "(Binary data $len bytes)";
            }
        }
        # Either the value itself or, failing that, a description of it.
        my $value = $exifTool->GetDescription($tag);
        if ( ( ! ( index($val,"Bad IPTC data") != -1 ) ) && ( length($tag) > 0 ) ) {
            $metadata_image{$value} -> {$val}++;
            $dirty = 1;
        }
        # Safety limit on the number of tags processed.
        if ( $c1 > 200 ) { goto safety; }
        $c1++;
    }
    safety:
    return $dirty;
}

sub power_print() {
    # Debug dump: one "key<TAB>value" line per value, blank line between keys.
    my $out = FileHandle -> new("debug.tmp","w");
    my @gg = sort keys %metadata_image;
    my $i = 0;
    print @gg . "\n";   # number of distinct keys, to STDOUT
    while ( $i < @gg ) {
        my @hh = sort keys %{$metadata_image{$gg[$i]}};
        my $k = 0;
        while ( $k < @hh ) {
            print $out $gg[$i] . "\t" . $hh[$k] . "\n";
            $k++;
        }
        print $out "\n";
        $i++;
    }
    $out -> close();
}
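The debug file written by power_print has one tab-separated key and value per line, with a blank line after each key. The following is a minimal sketch of reading that format back into the same key -> value -> count shape, for example to merge dumps from several images. The helper read_dump is hypothetical, and the sample text is a shortened stand-in for debug.tmp read through an in-memory filehandle. Note the assumption that values contain no line breaks; a multi-line caption would not survive this line-based parse.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "key<TAB>value" lines into key -> value -> count.
sub read_dump {
    my ($fh) = @_;
    my %seen;
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless length $line;                  # blank line separates keys
        my ($key, $value) = split /\t/, $line, 2;  # value may itself hold tabs
        next unless defined $value;                # skip lines without a tab
        $seen{$key}{$value}++;
    }
    return \%seen;
}

# Shortened stand-in for the contents of debug.tmp.
my $text = "city\tGENEVA\n\nwidth\t380\n\n";
open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
my $dump = read_dump($fh);
close $fh;

print join(", ", sort keys %$dump), "\n";   # prints "city, width"
```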