javascript - Pass a PDF document to the user-agent via XHR, using modern (HTML5) methods, without compromising document encoding


I am trying to accomplish what is described at "Handle file download from ajax post", but with a dynamically-generated PDF file, which I'm generating with PHP v5.6.10 (using the third-party PDFLib extension, v9.0.5).

Yet, when the PDF file is downloaded, it is corrupted; no PDF-reading implementation that I've tried is able to read the file, and every observation points to the fact that the file content is being butchered somewhere between printing the PDF content to the response body and saving the file via the user-agent (web-browser) with JavaScript.

I happen to be using jQuery v2.1.4, but I'm not sure that it matters, ultimately.

Important provisos

I should mention that, like the other asker (cited above), I have an HTML form that users fill out and submit via the POST verb. The form submission is performed with JavaScript, because there are actually 5 forms displayed in a tabbed layout that are submitted simultaneously, and any validation errors must be sent back via AJAX and displayed without refreshing the entire page. I mention this to make clear the fact that this is a POST request, which may return either a) a JSON object (that contains validation error strings, primarily), or b) a string that represents a PDF document, which should be presented to the user-agent as a file download.
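Because the same endpoint can answer with either payload, the client can only decide how to interpret the body after looking at the response headers. The following is a minimal sketch of that branching, not code from the post. It assumes the body was obtained as raw bytes (for example, from an XHR whose responseType was set to 'arraybuffer' before the request was sent) and that the server labels validation errors as application/json, which the post does not state. interpretResponse is a hypothetical helper name.

```javascript
// Hypothetical helper: decide what a response body is, based on its
// Content-Type header. "body" is an ArrayBuffer or a typed array of raw bytes.
function interpretResponse(contentType, body) {
    // Drop parameters such as "; charset=utf-8" and normalize case.
    var type = (contentType || '').split(';')[0].trim().toLowerCase();

    if (type === 'application/json') {
        // Assumption: validation errors arrive as UTF-8 encoded JSON.
        var text = new TextDecoder('utf-8').decode(body);
        return { kind: 'json', value: JSON.parse(text) };
    }

    if (type === 'application/pdf') {
        // Keep the PDF as bytes; never convert it to a string.
        return { kind: 'pdf', bytes: new Uint8Array(body) };
    }

    return { kind: 'unknown', bytes: new Uint8Array(body) };
}
```

The point of the design is that the PDF branch never turns the bytes into text, so no character encoding is applied to them.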

My code

The JavaScript

    $('#submit-button').click(function() {
        $.ajax({
            url: $(this).data('action'),
            type: 'post',
            data: $($(this).data('forms')).serialize(),
            processData: false,
            statusCode: {
                500: function() {
                    alert('An internal server error occurred. Go pound sand.');
                }
            }
        }).done(function(data, status, xhr) {
            processResponse(data, status, xhr);
        }).fail(function(jqXHR, textStatus) {
            if (textStatus === 'timeout') {
                alert('The request timed-out. Please try again.');
            }
        });
    });

    function processResponse(response, status, xhr) {
        if (response !== null && typeof response === 'object') {
            //The server will return either a JSON string (if the input was invalid)
            //or the PDF file. We land here in the former case.
        }
        else {
            //This doesn't change the behavior.
            xhr.responseType = 'blob';

            //This doesn't change the behavior, either.
            //xhr.overrideMimeType('text\/plain; charset=x-user-defined');

            //The remainder of this function is taken verbatim from:
            //https://stackoverflow.com/a/23797348

            // check for a filename
            var filename = "";
            var disposition = xhr.getResponseHeader('Content-Disposition');
            if (disposition && disposition.indexOf('attachment') !== -1) {
                var filenameRegex = /filename[^;=\n]*=((['"]).*?\2|[^;\n]*)/;
                var matches = filenameRegex.exec(disposition);
                if (matches != null && matches[1]) filename = matches[1].replace(/['"]/g, '');
            }

            var type = xhr.getResponseHeader('Content-Type');

            //Is logged to the console as "application/pdf".
            console.log(type);

            var blob = new Blob([response], { type: type });

            if (typeof window.navigator.msSaveBlob !== 'undefined') {
                // IE workaround for "HTML7007: One or more blob URLs were revoked by closing the blob for which they were created. These URLs will no longer resolve as the data backing the URL has been freed."
                window.navigator.msSaveBlob(blob, filename);
            } else {
                var URL = window.URL || window.webkitURL;
                var downloadUrl = URL.createObjectURL(blob);

                //Is logged to the console as URL() (it's an object, not a string).
                console.log(URL);

                //Is logged to the console as "blob:https://example.com/108eb066-645c-4859-a4d2-6f7a42f4f369"
                console.log(downloadUrl);

                //Is logged to the console as "pdftest.pdf".
                console.log(filename);

                if (filename) {
                    // use HTML5 a[download] attribute to specify filename
                    var a = document.createElement("a");
                    // safari doesn't support this yet
                    if (typeof a.download === 'undefined') {
                        window.location = downloadUrl;
                    } else {
                        a.href = downloadUrl;
                        a.download = filename;
                        document.body.appendChild(a);
                        a.click();
                    }
                } else {
                    window.location = downloadUrl;
                }

                setTimeout(function () { URL.revokeObjectURL(downloadUrl); }, 100); // cleanup
            }
        }
    }
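The filename-extraction step in the code above can be exercised on its own, which helps rule it out as the source of the corruption. The sketch below wraps the same regular expression in a hypothetical helper, filenameFromDisposition (a name that is not in the original code), so it can be run against sample header values without a browser. The quoted and inline header values are made-up examples.

```javascript
// Hypothetical helper: extract the filename from a Content-Disposition
// header value, using the same regular expression as the code above.
function filenameFromDisposition(disposition) {
    var filename = '';
    if (disposition && disposition.indexOf('attachment') !== -1) {
        var filenameRegex = /filename[^;=\n]*=((['"]).*?\2|[^;\n]*)/;
        var matches = filenameRegex.exec(disposition);
        if (matches != null && matches[1]) {
            // Strip any surrounding quotes from the captured value.
            filename = matches[1].replace(/['"]/g, '');
        }
    }
    return filename;
}

// The first value matches the header sent by the PHP in this post.
var plainName = filenameFromDisposition('attachment; filename=pdftest.pdf');
var quotedName = filenameFromDisposition('attachment; filename="my report.pdf"');
var inlineName = filenameFromDisposition('inline');
```

With these inputs, plainName is "pdftest.pdf", quotedName is "my report.pdf", and inlineName is the empty string, since the helper only looks for a filename when the disposition is an attachment.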

The PHP

    <?php

    use File;
    use \PDFlib;

    class PDF {

        protected $p;
        protected $bufferedContent;

        public function __construct() {
            $this->p = new PDFlib();

            $this->p->set_option('errorpolicy=return');
            $this->p->set_option('textformat=utf8');
            $this->p->set_option('escapesequence=true');
        }

        //...

        public function sendToBrowser() {
            $this->bufferedContent = $this->p->get_buffer();

            header_remove();

            header('Content-Type: application/pdf');
            header('Content-Length: ' . strlen($this->bufferedContent));
            header('Content-Disposition: attachment; filename=pdftest.pdf');

            $bytesWritten = File::put(realpath(__DIR__ . '/../../public/assets/pdfs') . '/' . uniqid() . '.pdf', $this->bufferedContent);

            echo $this->bufferedContent;
            exit;
        }

        //...

    }

Notice that in the PHP method I am writing the PDF file to disk prior to sending it in the response body. I added this bit to determine whether the PDF file written to disk is corrupted, too, and it is not; it opens in every reader I've tried.

Observations and theories

What I find strange is that I've tried the download in three different browsers (the most recent versions of Chrome, Firefox, and IE 11) and the PDF size is drastically different with each browser. The following are the file sizes from each:

  1. Written to disk (not corrupted): 105 KB
  2. Chrome: 193 KB
  3. Firefox: 188 KB
  4. IE 11: 141 KB
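One possible explanation for the larger files, offered here as an assumption rather than something the post confirms: if the binary response body is decoded as UTF-8 text and that text is later re-encoded as UTF-8 (which is what happens when a string is passed to the Blob constructor), every byte that is not valid UTF-8 is replaced by U+FFFD, which occupies three bytes. The sketch below reproduces that round trip on a few bytes of the kind found near the start of a PDF. roundTripThroughText is a hypothetical name and the sample bytes are illustrative only.

```javascript
// Hypothetical helper: simulate binary data being read as UTF-8 text
// and then written back out as UTF-8.
function roundTripThroughText(bytes) {
    // Invalid UTF-8 sequences are replaced with U+FFFD during decoding.
    var text = new TextDecoder('utf-8').decode(bytes);
    // Each U+FFFD is encoded as the three bytes EF BF BD.
    return new TextEncoder().encode(text);
}

// "%PDF" followed by four high bytes, as commonly seen in a PDF's
// binary marker comment. None of the high bytes form valid UTF-8 here.
var original = Uint8Array.from([0x25, 0x50, 0x44, 0x46, 0xE2, 0xE3, 0xCF, 0xD3]);
var mangled = roundTripThroughText(original);

console.log(original.length, mangled.length);
```

Here the 8 input bytes come back as 16, because each of the four high bytes becomes a three-byte replacement character. The original byte values cannot be recovered afterwards, which would be consistent with a file that is both larger and unreadable. This sketch does not account for the size differences between browsers.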

At this point, I am convinced that the problem relates to the encoding used within the PDF. I discovered this discrepancy when using WinMerge to compare the copy of the PDF that I dump directly to disk before returning the HTTP response with the copy that is handled via AJAX.

The first clue is this error message, which appears when I attempt to compare the two PDF documents:

WinMerge error: Information lost due to encoding errors

I click OK to dismiss the error, and the comparison resumes.

WinMerge comparison of the PDF documents. The corrupted PDF file is on the left-hand side, and the unadulterated PDF is on the right-hand side. Notice the variance in document encoding, highlighted in yellow, at the bottom.

The functional/correct PDF (at right, in WinMerge) is encoded using Windows-1252 (CP1252); I assume that that encoding happens within PDFLib (despite running on a GNU/Linux system). As one can see from the PHP snippet, above, I am calling $this->p->set_option('textformat=utf8'); explicitly, but that seems to set the encoding for input text that is included in the PDF document (and not the document encoding).

Ultimately, I am left wondering if there is any means by which to get this PDF to be displayed correctly after download.

Change the PDF encoding, instead?

I wonder if there is a "good reason" for which PDFLib is using Windows-1252 encoding to generate the PDF document. Is there any chance that this is as simple as changing the encoding on the PDFLib side to match what jQuery's AJAX implementation requires (UTF-8)?

I've consulted the PDFLib manual for more information, and there is a section dedicated to this subject: 4.2 Unicode-capable Language Bindings. This section has two subsections: 4.2.1 Language Bindings with Native Unicode Strings (PHP is not among them) and 4.2.2 Language Bindings with UTF-8 Support (PHP falls into this category). But everything discussed therein seems to pertain to the actual strings that are inserted into the PDF body, and not to the overall document encoding.

Then there is 4.4 Single-Byte (8-Bit) Encodings, with the following note:

Note: The information in this section is unlikely to be required in Unicode workflows.

How does one employ a Unicode workflow in this context?

The manual is available at http://www.pdflib.com/fileadmin/pdflib/pdf/manuals/pdflib-9.0.5-tutorial.pdf for anyone who feels that it may be useful.

Approaches I'd prefer to avoid

I really hesitate to get into the business of re-encoding the PDF in JavaScript, client-side, once it has been downloaded. If that is the only means by which to achieve this, I will go in a different direction.

Initially, my primary aim was to avoid an approach that could leave abandoned PDF files lying around on the server, in a temporary directory (thereby necessitating a clean-up cron-job or similar), but that may be the only viable option.

If necessary, I will implement an interstitial step whereby I write the PDF file to disk (on the web-server), pass it to the client using some unsightly hidden-iframe hack, and then delete the file once the user-agent receives it. Of course, if the user-agent never finishes the download, the user closes the browser, etc., the file will be abandoned and I'll be left to clean it up by some other means (the idea of which I hate, on principle).

Any assistance with this is hugely appreciated.

Have you tried it with an iframe?

I had the same problem, and I resolved it with an iframe. It is ugly code, but it works for me.

Solution with iframe
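The answer's own code is not included here. As a rough sketch of the general hidden-iframe technique (not the answerer's actual solution), the hypothetical helper below points an invisible iframe at a URL that serves the file with a Content-Disposition: attachment header, so the browser handles the download natively and the bytes never pass through a JavaScript string. The helper name and the URL in the usage comment are placeholders, and this form only covers GET requests, so the PDF would first have to be made available at a URL.

```javascript
// Hypothetical helper: trigger a native download by loading the file URL
// in a hidden iframe. "doc" is the page's document object.
function downloadViaHiddenIframe(doc, url) {
    var frame = doc.createElement('iframe');
    frame.style.display = 'none'; // keep the frame invisible
    frame.src = url;              // the browser fetches and saves the file
    doc.body.appendChild(frame);
    return frame;                 // returned so the caller can remove it later
}

// In a browser (placeholder URL):
// downloadViaHiddenIframe(document, '/path/to/generated.pdf');
```

Passing the document in as a parameter is only a convenience that lets the function be tried with a stand-in object outside a browser.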

