Wednesday, June 12, 2013

Annoying differences in how browsers handle Flash movement

Everyone hates the differences between the major browsers out there, not to mention the different versions of those browsers. Supporting all of them is a pain in the ass. And here is just one more reason why:

Action: Take a Flash Object element on a page, and then move it someplace else (append the element to a different parent node)
Code:
var newCont = document.createElement("div");
var swfOb = document.getElementById("swf-object-element"); // already exists on the page
console.log(swfOb.lastChild); // <param name="flashvars" value="name=value" />
swfOb.lastChild.value = "name=value2";
document.body.appendChild(newCont);
newCont.appendChild(swfOb); // the move: re-parent the Object element
swfOb = document.getElementById("swf-object-element");
var swfExternalFunction = swfOb.setCorner; // ExternalInterface function exposed by the SWF
swfOb.setCorner("tr"); // (attempts to) move the corner to the top right (starts in the bottom right)
// In NO browser does that call actually WORK
// ...wait 100ms
swfExternalFunction == null;            // false IE, FF;  true Chrome
swfExternalFunction == swfOb.setCorner; // false!! all
swfOb.setCorner("tr");                  // Error! IE;  works in FF, Chrome
swfOb.lastChild; // <param name="flashvars" value="name=value" />  FF!! ;  <param name="flashvars" value="name=value2" />  Chrome, IE (sometimes)

Consequences:
The SWF stays in the bottom right. The call to "setCorner" does nothing, in all browsers.

Because:
IE: Will not let you call a Flash External Interface function on a moved element. BUT!!!! the function will still exist on the element. Buggers. And I couldn't get a straight answer from IE when modifying the Object element before/after the location change. IE seems to have some sort of race condition going on. Sometimes it does, sometimes it doesn't, and sometimes the bugger doesn't even allow ANY External Interface calls any more. More hate for IE. Tested in IE 10, _but_ in IE8 standards mode. I had similar problems in actual IE8 - I could not get consistent results. I need to create better tests for IE, but that's time I don't have.

FF: FF re-inits the SWF, just like Chrome. Any changes to the "old" element will NOT be reflected in the new one. What's WORSE is that the reference to the element's External Interface function STILL EXISTS (on the Object) while the "new" object is being initialized, BUT!!! those functions STILL REFERENCE THE "OLD" OBJECT!! Talk about craziness...
But wait, I'm not done. Firefox completely IGNORES changes made to the object element in the meantime. So you can't, say, change the FlashVars and have them be updated in the "new" Object. Booooooo FF

Chrome: The best of the three. I'd still prefer that the SWF not be re-initialized, but oh well. At LEAST Chrome doesn't lie to you and expose functions that are out of date or don't exist. It also recognizes changes made to the Object element, and those changes are present on the "new" Object.



RESULT:

The only surefire way to move a SWF element is to set a timeout for after the move and re-init it then, knowing that all your previous changes are gone. That means you have to keep state externally, or pull the current state from the Object before you move it. There's NO WAY to verify that it's "ok yet" to make changes after you've moved it, so you just have to set a long timeout and pray that it's long enough (if you don't want to write browser-specific code, that is).
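To make that concrete, here is a minimal sketch of the "save state, move, wait, re-apply" approach. The getState/setState ExternalInterface functions and the 500ms delay are assumptions for illustration - as noted above, there is no reliable ready signal, so the timeout is a guess.

// Sketch only: getState()/setState() are hypothetical ExternalInterface functions,
// and 500ms is an arbitrary "hopefully long enough" delay.
function moveSwf(swfOb, newParent, onReady) {
  var savedState = null;
  try {
    savedState = swfOb.getState(); // pull the current state out BEFORE the move
  } catch (err) { /* some browsers may already refuse the call */ }

  newParent.appendChild(swfOb); // the move - FF/Chrome will re-initialize the SWF

  setTimeout(function () {
    // re-query the element; don't trust the old reference or its old functions
    var fresh = document.getElementById(swfOb.id);
    try {
      if (savedState != null) fresh.setState(savedState); // push the saved state back in
      onReady(fresh);
    } catch (err) {
      // still not ready (or IE refusing ExternalInterface) - retry or give up
    }
  }, 500);
}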

Thursday, May 23, 2013

How to stop cookies from being dropped on first party page

Originally written May 11, 2012


Goal - To block all non-first party cookies loading on a page.
Starting point - a <script> element loading my script into the first party page somewhere
Tools Used - Chrome Browser
What I know + test results:
DOM parsing is done synchronously and linearly - DOM Nodes are created as the page is "parsed" (i.e. in order).
For example, for a script in the header, the "body" element does not exist yet.
Scripts are executed when their DOM Node is created and added to the Document. 
Iframes are loaded when the elements have been added to the Document.
Images are loaded when the NODE IS CREATED, not when added to the Document.
Image nodes are created before the page is parsed - a level one parse if you will, or just a pre-scan. Either way you view it, images are loaded immediately and independently of DOM parsing.
It's possible to replace nodes "under" the current node by setting the "innerHTML" of the parent node. The lower node (removed by your "innerHTML" text) is never run (scripts might be loaded, but they won't be run).
Possible Methods:
Method 1: Halt page loading while allowing my own script to run and load an iframe (which in turn loads and runs), and once finished, continue page loading. Can remove elements according to preference.
Method 2: Focus on Iframes: replace them with a placeholder until the user allows the Iframe (a sketch of this follows below).
Using the rules we know listed up top, we can halt the page from loading wherever we insert our script. The question remains - if we load our iframe, will it be allowed to "play" while the parent page is "paused" waiting for our script to allow it to continue? Likely answer is yes. Didn't test.
Method 2 would just replace current iframes (and listen for new ones added later) and wait for a signal from the user to add them back. The Method 2 script would have to be placed at the top of the "body" to be effective - it IS location dependent, as other (non-embedded) scripts in front of it would delay its execution, allowing iframes time to load. External scripts or styles above our script in the body would unacceptably delay execution of our script.
Method 2 blocks only Iframes. Method 1 blocks Iframes + scripts.
Nothing can block images. Which means the answer to this is a big "can't do it".
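Here is a rough sketch of what Method 2 could look like. Assumptions: it runs as an inline script at the top of the body, and it polls for iframes added later (the "listen for new ones" part could equally be done with mutation events); the class name and placeholder text are made up.

var blockedFrames = [];

function swapOutIframes(root) {
  var frames = root.getElementsByTagName("iframe");
  // the collection is live, so walk it backwards while replacing
  for (var i = frames.length - 1; i >= 0; i--) {
    var frame = frames[i];
    var placeholder = document.createElement("div");
    placeholder.className = "blocked-iframe-placeholder"; // hypothetical class
    placeholder.appendChild(document.createTextNode("Content blocked"));
    blockedFrames.push({ src: frame.src, placeholder: placeholder });
    frame.parentNode.replaceChild(placeholder, frame);
  }
}

// called on the user's signal: put the iframes back
function restoreIframes() {
  for (var i = 0; i < blockedFrames.length; i++) {
    var entry = blockedFrames[i];
    var frame = document.createElement("iframe");
    frame.src = entry.src;
    entry.placeholder.parentNode.replaceChild(frame, entry.placeholder);
  }
  blockedFrames = [];
}

swapOutIframes(document);                 // iframes parsed so far
var scanTimer = setInterval(function () { // iframes added as parsing continues
  swapOutIframes(document);
}, 50);
window.onload = function () { clearInterval(scanTimer); swapOutIframes(document); };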

Conclusion:

Possible to block all third party cookies : NO
Possible to block advertising cookies : YES (because the ad iframe can be blocked)
Possible to block beacons : NO (unless in Iframes or loaded by script)
Possible to remove other scripts from the page : YES (not explained here, but possible, as long as they are located after our script)
Possible to remove/replace Iframes : YES

Apache CouchDB tests


I was looking into a solution for both log entries and also a simple file hosting solution which would not require logging into the server (an HTTP interface). Long ago I looked into distributed DBs, and one of the ones I read about was CouchDB. This is not a "truly" distributed DB, but it does have automatic sharding, which I'm still not really sure what that means (jk), but it's apparently important. The one thing that made this DB stand out was its HTTP API. Meaning, you can run a web site with only the DB - no Tomcat/Nginx/whatever required. This appealed to me, as a concept of simplicity if nothing else, and I assumed that combining your server with your DB would be faster than having them separate. So I looked into CouchDB in depth, and this is what I found:

Objects are JSON docs or attachments to those docs (attachments can be anything, generally an image file).
Objects can be served directly as-is, or through a "view", a "list", or a "show" (the URL shape for each is sketched below).
Direct document access returns JSON objects of the format {"id": ..., "rev": ..., "doc": ...the actual doc...}. No flexibility if you don't like that.
"Views" are typically results. In other words, in a SQL DB, it would be the result set of a query. A view is cached and updated as needed, automatically.
"Shows" are ways to reference single documents and format the response in a custom manner.
"Lists" are a way to custom format, combine docs, apply templates, etc - basically a "Show" for a "View" instead of a single doc.
Documents can have "binary" attachments - images, raw JS, etc. This would have to be used for "fast" script serving where the script is runnable in a browser.
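As promised above, a sketch of the URL shapes plus a sample view request. The database name "mydb", the design document "app", and the view/show/list names are placeholders I've made up for illustration.

// Direct doc access:       GET /mydb/some-doc-id
// Attachment ("binary"):   GET /mydb/some-doc-id/script.js
// View:                    GET /mydb/_design/app/_view/by-date?include_docs=true
// Show (single doc):       GET /mydb/_design/app/_show/page/some-doc-id
// List (formatted view):   GET /mydb/_design/app/_list/as-html/by-date

var xhr = new XMLHttpRequest();
xhr.open("GET", "http://localhost:5984/mydb/_design/app/_view/by-date?include_docs=true", true);
xhr.onload = function () {
  // view rows look like {"id": ..., "key": ..., "value": ..., "doc": {...}} when include_docs is on
  var result = JSON.parse(xhr.responseText);
  console.log(result.rows.length + " rows");
};
xhr.send();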

After some tests, it appears that:
"Shows" are semi-fast. Views are faster, since I think they are served directly from the cache.
"Lists" are not too fast. They cache the Views which make up the list, but they have to JSON.parse the docs and then stringify them again for EVERY query. This takes time - not disk access time, but processing time, I believe. And because this is actually three steps - get the view, get the JSON objects, process the objects - there are three places to bottleneck, and apparently they DO bottleneck: response times vary by an order of magnitude.
"Views" are very fast. It appears it's all cached and ready to serve. Not quite memory-cache fast, but I think that's because I use "include_docs", which makes the DB do a lookup for each included doc.
Didn't test binary serving thoroughly, but it's fast too.

Performance Tests (LOCAL): getting 12kb
1000 queries, using 5 connections (for the times in parentheses, the connection count was upped to 20) - see the harness sketch after this list
  • "show" = 1300ms  (1180ms)
  • "list" = 23 seconds. Fastest single request: 14ms, Slowest single request: 220ms  (same)
  • "view" = 450ms  (430ms)
  • direct access = 450ms  (430ms)
  • direct access as attachment (binary) = 380ms  (same)
  • TOMCAT simple file serve = 450ms   (190ms)
  • ICON SERVER from memory = 200-280ms  (same)
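For reference, "1000 queries over N connections" here means something like the following harness - an assumption on my part, since I'm not recording the actual tool used; the URL is a placeholder.

var http = require("http");

var TOTAL = 1000;      // total requests
var CONNECTIONS = 5;   // concurrent "connections" (in-flight requests)
var OPTIONS = { host: "localhost", port: 5984, path: "/mydb/_design/app/_view/by-date?include_docs=true" };

var started = 0, finished = 0, t0 = Date.now();

function next() {
  if (started >= TOTAL) return;
  started++;
  http.get(OPTIONS, function (res) {
    res.on("data", function () {}); // drain the body
    res.on("end", function () {
      finished++;
      if (finished === TOTAL) {
        console.log("total: " + (Date.now() - t0) + "ms");
      } else {
        next();                     // keep this connection slot busy
      }
    });
  }).on("error", function (err) { console.error(err); });
}

for (var i = 0; i < CONNECTIONS; i++) next();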
The number of connections is sort of a grey area for me - I'm not sure of the technical ability of my OS, Tomcat, or CouchDB to actually open and process N connections. The numbers for Tomcat say that opening 2 connections is no slower than opening 20, so I assume I don't have the whole picture. Or maybe I do - Tomcat FILE serving sped up greatly with more connections - so maybe the Icon Server's Tomcat instance is simply already at a bottleneck somewhere, and adding more connections just doesn't do anything. Opening more connections showed negligible improvement in CouchDB, so either there is a bottleneck elsewhere or something completely unknown to me is going on. Opening fewer connections to CouchDB, however, made response times vastly worse. What this means is that, given the strange result of the ICON SERVER response times being the same for 2 connections as for 20, that result has to be thrown out. Since that is the ONLY result that is meaningful to me, it's a big hit to the value of this test.
But in my tests of the live icon servers, I find serving simple files to ALWAYS be slower than serving from memory, so we can use the simple-file-serve time to represent (an upper bound on) the real icon server time. Any way you swing it, the icon server will be twice as fast as CouchDB. This is to be expected though, as no normal DB will be as fast as information cached in memory. However, there are vast differences between best case and worst case.
So, since local tests aren't going to work for me, let's try elsewhere.
Tests on the QA Amazon server: getting 12kb from CouchDB, 9kb from the I.S.
(Icon Server from memory cache)
100 queries, 1 connection:
  • I.S: 18.5s
  • CDB: 17.6s
1000 queries, 10 connections
  • I.S.: 21.5s
  • CDB: 27s
1000 queries, 30 connections
  • I.S.: 13.7s
  • CDB: 22.8s
These results confirm more definitively what the local tests suggested - as the number of connections grows, CouchDB benefits less and less, to the point where adding connections no longer improves performance. The Icon Server is getting a smaller file, 25% smaller than the CouchDB one, so the numbers are a little off, but we can see the overall trend. Adding a second or two to the I.S. times doesn't really change the result.

Summary

This is not a high-performance DB; it is a feature-specific DB. It would be great for repos or small projects.
These results are actually very positive for CouchDB in certain circumstances. It's half the speed of the current Icon Servers - but the icon servers are speed demons, and half their speed is actually very good for a non-memory-based DB. CouchDB clusters work like Git repos - there is no master or slave. They work together to update each other and guarantee eventual consistency. This means that ANY one DB can be updated and the update will spread across the cluster automatically, and every DB is an exact duplicate of the others. If one fails, you lose nothing. If all but one fail, you still lose nothing.
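For flavor, replication between two nodes is a single HTTP call. A minimal sketch, assuming CouchDB's standard /_replicate endpoint; the host and database names are made up.

var xhr = new XMLHttpRequest();
xhr.open("POST", "http://node-a:5984/_replicate", true);
xhr.setRequestHeader("Content-Type", "application/json");
xhr.send(JSON.stringify({
  source: "logs",                    // local DB on node-a
  target: "http://node-b:5984/logs", // remote copy on node-b
  continuous: true                   // keep replicating as changes arrive
}));
// run the mirror-image call on node-b (or swap source/target) and the two copies converge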
The BEST thing about CouchDB is its independence from a separate web server, via its HTTP API. Development time on CouchDB, **if it suffices for the purpose**, can be DRASTICALLY reduced compared to the normal Java EE stack.

What CouchDB is great for:

Dynamic content. Transactions, or dynamically bound content. Low-to-medium bandwidth. Version control - it would be great for a Git-repo-type service, chat, transcripts, documents, a cloud drive.

What CouchDB would be bad for:

Logging DB. High-bandwidth. Static content.

I'd recommend using this DB for independent (standalone) projects, proof-of-concept or "testing the waters" projects, or the cases in the "great for" section above. As for my particular reason for testing this: I wanted a system for dynamically binding content when it's served - specifically, I wanted to apply a system of imports and modularization to Javascript files. What I found is that the only practical way to do this would be to serve a generic bootloader JS from the Icon Server and then have it make secondary requests to this DB, OR bolt a secondary system onto the CouchDB server itself - I would have to dedicate a DB connection to a listener which, when updates are applied to key files (modules), would update the resultant files (the served JS). That is exactly what I was attempting to avoid by using CouchDB: extra effort on an extra system.

Notes and Rants about Firefox and Chrome Addon building

Originally from April 16, 2012

When I first started building an Addon for Firefox, I thought it was really cool to have access to the inner workings of the browser via their extensive list of APIs. I was not used to such openness in an application - usually even the most open application will limit developers in some very important and annoying ways. As a Flash/Actionscript developer, I can rant for several days about how annoying a few very important limitations in Flash are, and how many weeks of my life I've devoted to getting around them. This was not the case with FF, and I appreciated it.

But after a few months, I started to get a real picture of how things actually worked - every API was different, separate, and likely built by different people. This meant that every time I had to use a new API, there was a learning curve. Then, months later, while debugging, I would have to go back and remember how the h3ll I did that. MDN (the Mozilla development reference pages) became my new bible and an indispensable tool for understanding what would otherwise never make sense to me. A few parts of the way FF extensions work were genuinely interesting to me - I really like how modules work and I appreciate the design, and their DB API was fairly convenient and easy to use. But most curious is how they allow development in both C and JS and how the two relate to each other (FF was initially C only, but they soon built a layer on top of that to allow for JS coding - early code examples show both a C and a JS version, which lets you glimpse how they must have translated between the two).

Building this all in JS, back when the only alternative was IE and C/C++, must have made (non-C) developers weep with joy. But unfortunately there are too many artifacts of that early transition that just aren't necessary in JS, yet are kept in place - for backwards compatibility? for easy understanding by long-term FF programmers? simply because "that's how we've always done it"? At least these annoying quirks in FF are well documented, if still confusing and time-consuming to implement. And that's not counting the time it takes to understand each API and, more importantly, what the data received from it actually means / what the data sent to it actually does (I found an annoying lack of this kind of detail on MDN - they explain well how to code against an API, but they often neglect the specifics of what the API actually does).

So after a year of my time (9 hrs a day, 5 days a week: felt like a lifetime) devoted fully to Firefox development, 3 full revisions of the same plugin, and 10000 changes, I - and my employer - now have a stable, well-working Addon that I can be proud of. Another major reworking of it is in development which, in design, is much simpler and has -many- fewer points of possible failure (i.e. more elegant). But in the middle of that transition, I was tasked with creating a brother for this Addon in Chrome and IE, though the IE brother is going to be fairly limited. Thus the process of porting to Chrome begins.

I knew a little about Addons in Chrome from a year ago, when I made a simple history-keeping Addon (which only kept the history for domains I specified, and on those domains scanned each page for certain key words, logging instances when those words appeared). The structure of Chrome is completely different and far less powerful than Firefox - Firefox uses Gecko, which is a full-blown application platform, whereas Chrome just provides a few APIs. Initially, I was less than wowed. In fact, when visiting the Google campus, I stood outside the Chrome building and, in a voice as loud as I dared, told them to extend their APIs because they "sucked" haha. And in scope, yes they do, compared with Firefox. But I came to learn that what they lack in scale, they make up for big time by at least having a unified design and a much simpler interface than FF. And fortunately, a HUGE feature of the Addon I make - the ability to hear and intercept outgoing and incoming requests - was being implemented as I spoke (and boy, is it a million times simpler than the FF equivalent).

Chrome has inherent limitations on what features it can/should provide, based on how Addons are implemented. In Chrome, Addons are simply invisible html pages that are allowed to interact with the other html pages the user has open. At the same time, they have a few special privileges in regards to loading information from the internet, and Chrome has a few convenient APIs by which information can be passed back and forth to the Addon. First major design point: ALL communication in Chrome is done via JSON. Now, they are nice enough to do the translation for you, automatically calling JSON.parse and JSON.stringify when necessary, making it just a tad simpler for developers. This message passing has the same limitations as the normal JSON services which populate the internet, but that also makes it very comfortable/familiar to web developers.

(There are of course a few exceptions to this rule in recent APIs, most notably the WebRequest API, which is the most recent API added to Chrome - and thank a deity too, because it's exactly what I needed for my addon. I don't mean that it's not JSON; I mean that it's not typical message passing. For those that don't know, "message passing" is an actual thing, not a noun+verb, and it is asynchronous, which as a feature has the single most significant impact on the design of an Addon for Chrome. The WebRequest API is different in that it acts synchronously with respect to an addon's input. I doubt the Chrome developers suddenly decided to have a truly synchronous API, so I imagine it's simply designed to wait at certain points for the addon's response, but it still acts in sequence with external events, which makes it different from the other APIs.)

So basically, everything is asynchronous, and everything is passed around as a JSON object. There are of course a few other quirks, but nothing else so important. Nothing is actually integrated with the browser or has control over any browser function/feature.
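A rough sketch of the two patterns just described - asynchronous JSON message passing and the blocking WebRequest listener. Assumptions: the manifest grants the webRequest/webRequestBlocking permissions and host access, and the message name "getSettings" and the URL filter are made up for illustration.

// 1) Ordinary message passing: asynchronous, JSON in and JSON out.
chrome.runtime.onMessage.addListener(function (message, sender, sendResponse) {
  if (message.type === "getSettings") {
    sendResponse({ enabled: true }); // serialized to/from JSON for you
  }
});

// 2) The WebRequest API: consulted in sequence with the request itself,
//    and able to block it by returning a value.
chrome.webRequest.onBeforeRequest.addListener(
  function (details) {
    // details.tabId already tells you which tab made the request
    if (details.url.indexOf("tracking-pixel") !== -1) {
      return { cancel: true };       // drop the request entirely
    }
  },
  { urls: ["<all_urls>"] },
  ["blocking"]
);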

This is the exact opposite of Firefox. Firefox actually loads an Addon's code into the browser itself, using "overlays" - something like CSS, which "cascades" the different levels of code into a single combined source. This is then run as Firefox itself (actually, FF has two layers, the chrome and the Gecko/XULRunner backend; addon code is added to the chrome). Once combined, an addon has full access to the chrome layer of Firefox and can pretty much do what it wants to that layer. It's also possible to create "modules", which are actually run as separate entities, and even components which are attached to the Gecko backend (not integrated, just run on it). The current reworking of the FF addon which I mentioned before is actually transitioning from a mostly chrome-layer-based addon to a purely module-based form (aside from the necessary UI). All functionality from the backend is done via APIs which, as I mentioned before, are each different and occasionally very annoying to deal with. The lack of a single purpose, form, or style amongst these APIs means that not only are you stuck putting the pieces of the puzzle together, you also have to get your razor blade out and cut the pieces to fit each other. You would not -believe- the trouble involved in figuring out whether a recently set cookie was a third-party cookie or not!! You have to use three different APIs and implement a component to do it correctly (certainly you can make some "guesses" and make this much simpler), and the component is terribly confusing because there are tons of different possibilities, no guarantees, and no documentation about the different cases you might see. Oh, and don't forget the case where the user has multiple windows open!! Just shoot me now and get the suffering over with! Anyway, Firefox's addon capabilities are truly powerful, and truly scary at times. At least there are synchronous events (observations) and DB calls.
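As a small taste of the "observation" style mentioned above, here is a sketch of listening for cookie changes from chrome/module scope. This is only the easy first step - working out whether the cookie is third party is the multi-API puzzle described in the paragraph, and it's deliberately left out.

const { classes: Cc, interfaces: Ci } = Components;

var cookieObserver = {
  observe: function (subject, topic, data) {
    if (topic !== "cookie-changed" || data !== "added") return;
    var cookie = subject.QueryInterface(Ci.nsICookie2);
    // all you get is the cookie itself (host, name, value, ...);
    // tying it back to the page that caused it is the painful part
    dump("cookie added for host: " + cookie.host + "\n");
  }
};

Cc["@mozilla.org/observer-service;1"]
  .getService(Ci.nsIObserverService)
  .addObserver(cookieObserver, "cookie-changed", false);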

Then the time came to port the code of our mature FF plugin to Chrome. As I mentioned before, I had a negative opinion of Chrome's addon abilities due to its lack of APIs and limitations on functionality. But for my purposes, Chrome's WebRequest API seemed to provide the functionality I needed and thus would make a fully functioning Chrome addon possible. I'll not write my notes on this port here, but to make my point, I'll just mention that I found myself deleting significant portions of the FF addon source code. In some cases, a file of 300+ lines of code was reduced to 20 lines in Chrome with equivalent functionality. This was across the board, making the code easier to read and understand by those who are not me. Chrome actually seems to provide their APIs with intended uses in mind, not just tossing some random information in the air and seeing what happens (FF). For example, just telling me what tab makes an http request is as simple as request.tab in Chrome - but in FF, I have over 80 lines of code to figure out the same thing (simple cases are about 15 lines, but it's the edge cases that really piss me off), and it's STILL not guaranteed to find an answer. I'm just gonna give a "WTF?" on that one.

Though my main concern - finding out what web page set a third-party cookie - is about equally difficult in Chrome: just like in FF, the only information you start out with is the domain name of the cookie; the rest you have to figure out yourself. I'm gonna give a "WTF??" to both Chrome and FF on that one. I expect as much from FF, but I'm surprised that Google didn't think that's important information.

Nasty Tomcat memory leak messages with JAX-RS

So my JAX-RS service kept giving me warnings about memory leaks - sometimes many of them, depending on how long my service had been running, it seemed. I was a bit concerned. But then I found a nice little helpful snippet on the web:

http://java.net/jira/browse/JAXB-831

And while still annoyed, I feel a lot better knowing that I can ignore the message (assuming, of course, that the information in the ticket is correct - I did not bother to validate it myself).