Google PageSpeed Insights

Scoring 100 on Mobile and Desktop in Google PageSpeed Insights can be a daunting task. Once you use a JS framework, you’re pretty much screwed because of the “above-the-fold” render-blocking rule.

But is it really important to score 100? In which cases should you worry about the score? There are heaps of articles suggesting you “stop worrying about the PageSpeed Insights score”. Well, I agree that it might not be the most important factor for the user experience. But on the other hand, I don’t agree that it’s not important to Google. A Google employee told me during an AdWords campaign (set up by Google themselves) that our mobile page speed was low and that this was penalized by the search engine. And how did he measure it? With Google PageSpeed Insights, of course. See, the point is, whether PageSpeed Insights is a good metric or not doesn’t matter. It matters to Google, so it matters for your SEO.

It’s a bit hard to design an experiment that actually shows the impact of an increased PageSpeed Insights score. What I could think of was this: write an article and publish it twice, once with a bad page speed score and once with a good one. So here’s the article; it’s about Angular Universal:

and here’s the article again, just loading slower since I load five unnecessary versions of jQuery in the head:

The first article scores 99/100 (mobile) and 100/100 (desktop), the second article 61/100 (mobile) and 75/100 (desktop). Let’s see what happens; I’ll post the results here in a few weeks.

By the way, the article is also related to this topic, since Angular Universal would be the answer to page speed problems. Well, you know, if it’d actually work (read the article).

Typescript Mongo Express Angular.io Node (MEAN) Boilerplate

UPDATE: A cool project has developed out of my initial Typescript-MEAN seed!

Check it out on Github!

I was a bit shocked when I searched for Typescript MEAN (Mongo-Express-Angular.io-Node) tutorials and boilerplate code (with Angular.io I mean Angular2 or above). There is some material, but it’s pretty outdated. Of course it’s a bit unfortunate that you have to count 2015 as outdated in 2017, but that’s how it is in the frontend world. So I set out to build my own Typescript boilerplate for MEAN (Mongo-Express-Angular.io-Node) apps that I can reuse for many apps to come. I also wanted the boilerplate to include unit tests, so this is not something you’ll have to add later on, but can start with right away.

So basically the requirements that I had for this boilerplate were:

  • 100% typescript. Furthermore, the backend and the frontend should have shared code (e.g. data models).
  • Full coverage with unit tests
  • Support simple REST calls of the form api/v1/:resourceName/:resourceId (see the sketch right below)
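To illustrate the last point, such a route looks roughly like this in Express (a sketch only: the handler is hypothetical, and in the boilerplate this would go through the api/ and db/ layers rather than living in one file):

import * as express from 'express';

const app = express();

// Generic endpoint of the form api/v1/:resourceName/:resourceId
app.get('/api/v1/:resourceName/:resourceId', (req, res) => {
  const resourceName = req.params.resourceName;
  const resourceId = req.params.resourceId;
  // In a real implementation this would look the document up in MongoDB.
  res.json({ resource: resourceName, id: resourceId });
});

app.listen(3000, () => console.log('api listening on port 3000'));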

It turned out that for the frontend part I didn’t have to do anything. This is because Angular2 comes with angular-cli, an always up-to-date build & scaffolding tool for your Angular apps. So in order to build the Angular part, I simply had to refer to the Angular CLI. But for the backend, the Mongo-Express-Node part, a lot of setup had to be done.
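For reference, scaffolding the frontend boils down to something like this (assuming a global CLI install; the project name is just an example):

npm install -g @angular/cli
ng new frontend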

In order to be able to develop independently on the frontend and the backend, e.g. if you have backend devs and frontend devs on your team, I set up the basic structure of the boilerplate like this:

typescript-mongo-express-angular-node-seed
├── .git
├── backend/
│ ├── .git
│ ├── db/
│ ├── ...
│ └── package.json
└── (frontend: just use angular cli)

So there are two repos on the same level: backend and frontend. Additionally, you could add a third repo with shared code between frontend and backend. At first I put the data models into a shared folder and published it to npm (so they could be reused from frontend and backend), but it turned out to be quite annoying since they had to be changed too often and my editor was lacking autocompletion for this workflow. Anyways, the backend and frontend are completely separated. By completely separated I mean the backend has its own git repo. I can send my backend dev just the link to the backend repo.

The real work lies in the backend. There, we need to set up Express, an app configuration, connect to Mongo, build unit tests etc. The backend structure now looks like this:

typescript-mongo-express-node-backend
├── properties (gitignored)
├── dist (gitignored)
├── src/
│ ├── api/
│ ├── auth/
│ ├── config/
│ ├── db/
│ ├── logger/
│ ├── test/
│ └── index.ts
├── package.json
├── README.md
└── tsconfig.json
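To give an idea of how these folders play together, a heavily stripped-down index.ts could look roughly like the following (a sketch only: I’m assuming mongoose for the Mongo connection, and the port and URI values are illustrative; the actual seed is more elaborate):

import * as express from 'express';
import * as mongoose from 'mongoose';

// In the real seed, values like these would come from the config layer.
const port = 3000;
const mongoUri = 'mongodb://localhost/tsmean';

const app = express();

// The api/ layer registers the REST routes, e.g. api/v1/:resourceName/:resourceId.
app.get('/api/v1/health', (req, res) => {
  res.json({ status: 'ok' });
});

// Only start listening once the database connection is up.
mongoose.connect(mongoUri)
  .then(() => app.listen(port, () => console.log(`backend listening on ${port}`)))
  .catch(err => console.error('could not connect to mongo', err));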

There are different layers, all including unit tests written with chai and mocha. So what I’d recommend you to do at this point is to just go and check out the seed / boilerplate. You can find it at:

The README will have further instructions on how to install, run & test the code. I hope this boilerplate helps to get you started with Typescript & the MEAN stack. In the following, I also append a short “Why the MEAN stack”, because knowing the why is even more important than knowing the how. But I only append it because I guess most came here for the “how”, since the why is already well covered in other articles.

Why the MEAN stack?

Developer experience

As opposed to having backend and frontend in different languages, you’ll just need to write Javascript / Typescript most of the time. This makes you a bit faster as a full-stack dev, since you don’t have to switch context that much and just need to be fluent in one language. Also, if you’ve implemented some routine in the backend, but you decide it would be better to run it in the frontend (or vice versa), it’s much easier to migrate the code.

On the downside, a backend in Node & Mongo can also bring a lot of problems. Node uses the non-blocking asynchronous nature of javascript (more on that in the next section), which requires a lot of callbacks. This makes programming harder and less linear, as illustrated below. For example, if you have Java devs on your team, they might have a hard time adjusting to this and mutter “wtf” quite a few times.
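Here is the kind of nesting you end up with in callback style (a self-contained sketch: findUser and findOrders are made-up callback-style functions, and the timers just fake the I/O):

type Callback<T> = (err: Error | null, result?: T) => void;

// Made-up callback-style functions, like you'd get from older Node APIs.
function findUser(id: string, cb: Callback<{ id: string; name: string }>): void {
  setTimeout(() => cb(null, { id, name: 'Alice' }), 10);
}

function findOrders(userId: string, cb: Callback<string[]>): void {
  setTimeout(() => cb(null, ['order-1', 'order-2']), 10);
}

// The "less linear" part: every dependent call nests one level deeper.
findUser('42', (err, user) => {
  if (err || !user) { return console.error(err); }
  findOrders(user.id, (err2, orders) => {
    if (err2 || !orders) { return console.error(err2); }
    console.log(`${user.name} has ${orders.length} orders`);
  });
});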

Non-Blocking I/O

Node runs on a single thread and all its calls, for example to MongoDB, are asynchronous. This means Node can perform well when you have thousands of requests per second. Whether you really have that or ever will is not mine to judge.
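A minimal way to see the non-blocking behaviour in action (the “query” is just faked with a timer):

// The single thread is free while the fake query is pending,
// so the second log line appears before the query result.
function fakeQuery(): Promise<number> {
  return new Promise(resolve => setTimeout(() => resolve(1000), 50));
}

fakeQuery().then(count => console.log('query finished, documents:', count));
console.log('this line runs immediately, nothing was blocked');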

Conclusion

The MEAN (Mongo-Express-Angular2-Node) stack can be the stack of choice for frontend devs going full stack. However, there doesn’t seem to be a lot of good, up-to-date and typed (Typescript) boilerplate code out there for the backend part. So I set out to create & maintain this boilerplate code. Here you can find the current seed. To get the latest news, check it on Twitter. Additionally, there now is a small tutorial page on how to get it up & running from scratch: tsmean.com!

Jsfiddle vs Codepen vs Plunker vs JSBin for Embedding

JSBin

JSBin needs a pro account for many features (such as embedding), so I’d refrain from it, since the others are just as good or better.

CodePen

It’s got a beautiful design, but some things about it are stupid. For example embedding the code into another page. Apparently it can be found in the export menu:

But what I get from this menu is this:

So I guess it’s a pro feature? Anyways, they could at least show those items, but gray them out and let me know why they’re not available.

JSFiddle

JSFiddle also has some bugs. For example, if I accidentally put some css in the html panel, I can’t remove it anymore because the error notifications block my view:

Apart from this, the view seemed to load a bit faster than with plunker, and it is better structured, with the js / html / css / result tabs, in case you have something simple to demonstrate.

(okay, the scrollbar wouldn’t need to be there, but that’s a detail)

Here’s an actual example:

Plunker

Since plunker supports many files, it needs to be laid out a bit differently than jsfiddle. Here’s what it looks like:

It’s also quite neat, and I had the best experience with editing in plunker so far. Also, the plunker preview shows the result first, while jsfiddle shows the js/html first. Here I’d say it depends on your use case which you prefer.

Conclusion

I’d recommend JSFiddle for simple things or plunker for entire apps. However, if you have many, many frames to embed, none of the options might be ideal. As you can see here https://www.bersling.com/2017/03/22/flexbox-tutorial/ it might slow down quite a bit if there are too many frames.

How to write a library for Angular 2 and publish it to NPM

Using modular code is a good idea, and nothing is more modular than writing a library. That way you can just publish your library to npm and then use it in all your projects. This can be useful if you have a component you want to reuse or share, a service, or just any piece of code that you think can be reused between projects. But with Typescript and Angular 2, writing your own library can be a bit daunting at first. Do you publish the Typescript or the compiled Javascript? Where does the result go, in a dist folder? What are best practices? What should I declare as dependencies of my library in my package.json?

Fortunately, you are not the first one to come across those questions. Unfortunately, as things are changing rapidly, there are a lot of out-of-date answers when you google this problem. And there’s even more confusion because it seems to be immensely complicated to get a working Angular2 library. So this article aims at giving an up-to-date answer on how to create your ng2lib.

The currently best way to scaffold your Angular 2 library

So without further ado, this way is the best I’ve found so far to scaffold your Angular 2 library:

I think this is currently the best approach, since everything else simply will make you pull out your hair.

I last checked whether this is still the best solution in June 2017.
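Whichever scaffolding you choose, the package.json questions from the intro usually end up answered roughly like this: you publish the compiled Javascript plus the .d.ts typings (e.g. from a dist folder) and declare Angular as a peerDependency rather than a regular dependency. A sketch, where the names and versions are just examples:

{
  "name": "my-ng2-lib",
  "version": "0.0.1",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",
  "peerDependencies": {
    "@angular/core": "^4.0.0"
  },
  "devDependencies": {
    "typescript": "^2.3.0"
  }
}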

Can I use angular-cli to create a standalone library?

As of March 2017, the angular-cli, which is used to create Angular projects as well as components, services etc. for your project, doesn’t offer a clear way to create a standalone library.


On a side note, if you just need to create a Typescript library, no Angular involved, you can check out how-to-write-a-typescript-library.com!

Contenteditable vs Input

Contenteditable and input can be quite similar. You can click both, and then you can change the text inside. So where’s the difference, and when should you use which one?

Contenteditable

With contenteditable, you can modify an html snippet. Usually the html on a website is “view-only”, meaning you can’t just click somewhere and then edit the html. But as soon as you add the contenteditable attribute to an html tag, all of the inner html is editable by clicking on it and hitting keys on the keyboard. So for example in:

<body contenteditable=true>
  <h1>Contenteditable</h1>
  <div style="border: 1px solid red; padding: 5px">
    Everything here is editable...
  </div>
</body>

you can edit all the html inside, which would yield:

When is this useful?

  • If you don’t want to separate the editor-view from the display-view. See example below.
  • If you want a multiline input that’s wrappable (see example below).
  • If you want to let a user mess around with the html, but don’t want to provide a wysiwyg.

This example illustrates a usecase of a contenteditable, where it can achieve something that’s not possible with an input. In this case the contenteditable is the bold title. See how it wraps around the buttons? Good luck achieving that with an input.

Many words of caution

However, you also should be cautious with contenteditable as there are a few pitfalls.

Take care if you want to store the edited string to a database and it should be a simple string, not html.

In this case you should either resort to the input field or take special precautions. In case you still want to head down this road, here are some things to bear in mind:

  • Different browsers will handle contenteditable differently. For example, Firefox sometimes adds a <br> tag to the edited content, whereas Chrome doesn’t.
  • The user could copy/paste some html into the contenteditable area. For example try to select and copy this into the contenteditable above.

Due to those reasons, you should definitely process the html before storing it into your database. One possibility to process / sanitize the html would be:

var sanitized = htmlElt.textContent || htmlElt.innerText;

This will ensure you only get the text of the content.
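If you want to go one step further, you can also strip the html at paste time, so markup never enters the contenteditable in the first place. A sketch (using the clipboard data of the paste event and the old-school but widely supported execCommand):

// Sketch: strip any html from pasted content by re-inserting it as plain text.
const editable = document.querySelector('[contenteditable]');
if (editable) {
  editable.addEventListener('paste', (event) => {
    event.preventDefault();
    // Take only the plain-text representation of whatever was pasted...
    const clipboard = (event as ClipboardEvent).clipboardData;
    const text = clipboard ? clipboard.getData('text/plain') : '';
    // ...and insert it at the caret position.
    document.execCommand('insertText', false, text);
  });
}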

What about security?

Well, you can try it for yourself: Copy/Paste this

<script>alert('hi')</script>

into the contenteditable above. As you can see, it’s escaped properly.

Input

The input field on the other hand has its use in the following cases:

  • Forms
  • One-line editing of a string, overflow is hidden.
  • One-line editing of a number.
  • Restricted input.

Conclusion

It’s not always easy to choose which one to use. The basic rule of thumb would be

  • Contenteditable to edit html
  • Input to edit strings or numbers

You should only break this rule if you have very special needs, as in the use case above, where the contenteditable was used to minimize the difference between editor/input and display and to wrap the content around the buttons.

AWS (Amazon Web Services) is definitely the right choice for your production environment

When you’re doing your own blog or your own webpage, you can easily choose some simple setup like the ones provided by DigitalOcean. However, when it comes to a scalable setup, AWS is simply unbeatable. Why, you may ask? Well, because it offers everything your heart desires. This may be daunting at first, but in the long run it’s definitely easier to manage everything on AWS. And by everything, I mean everything. Most people only think about servers “and a bunch of confusing stuff” when they think about AWS, but once you start looking at it for a bit, you’ll start enjoying the other services just as much.

Registrar: Route 53

You won’t need GoDaddy or Gandi or any other registrar anymore. There’s Route 53 for that. So you can also handle your DNS with AWS!

Certificates (SSL): ACM

Obtaining SSL certificates used to be expensive. Nowadays it’s free with things like letsencrypt, but it can be even easier: with ACM (AWS Certificate Manager) you can create certificates within minutes, ready to attach to anything on AWS with just a few clicks.

Servers (obviously): EC2

The beginner’s choice for servers are EC2 instances. Just spin them up, configure some settings (security groups etc.) and do whatever you like with them. There isn’t much benefit here over other server farms, except that it’s a bit cheaper, more flexible and integrates well with the other AWS services. Next, an example of such integration.

Load Balancer (also EC2)

From the first server on, it’s a smart idea to put the server behind a load balancer. This ensures that you can easily attach and detach servers when the load changes, as opposed to pointing your DNS at just one server. In a production environment, that setup kind of creeps me out. E.g. what happens when you want to reboot the server to update the kernel? No chance without a load balancer.

File Storage: S3

You could store your files in your db, but it’s also possible and perhaps even more flexible to store them in S3 buckets. You could even build your entire db-system on top of S3, e.g. by storing JSON documents in S3. S3 has versioning, so you could even build on this!
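As an illustration, storing a JSON document in a bucket with the JavaScript SDK only takes a few lines (a sketch: the bucket name and key are made up, and region/credentials are assumed to be configured):

import * as AWS from 'aws-sdk';

const s3 = new AWS.S3();

// With versioning enabled on the bucket, every put keeps the previous versions around.
s3.putObject({
  Bucket: 'my-example-bucket',
  Key: 'documents/user-42.json',
  Body: JSON.stringify({ name: 'Alice', plan: 'pro' }),
  ContentType: 'application/json'
}).promise()
  .then(() => console.log('stored'))
  .catch(err => console.error('upload failed', err));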

Security

AWS doesn’t leave security up to chance. Their security is world-leading in all aspects.

Availability

AWS has ridiculously high availability, so you don’t need to worry that much anymore about accidental data loss through server breakage or something like that. You could furthermore use versioned S3 buckets to give you a backup history.

Accessibility

With IAM users and security groups, AWS has a logical and easy-to-use interface to manage access to your instances and to your AWS controls. You can even enable MFA (Multi-Factor Authentication) to make sure your precious production environment, maybe worth millions, isn’t accessible with just a leaked password.

Conclusion

That’s just the tip of the iceberg and what we’re using at the moment, but for just about every need you might encounter with your web setup, AWS has the perfect solution for you.


Migrating from GoDaddy to AWS Route 53

TL;DR: 1) Be careful that the nameservers are up everywhere before you shut down the old system (wait 48h) 2) Check if your MX records are correct (mail still working?)

Migrating nameservers and DNS is always tricky. It’s not instant, so it could look fine on your computer but be completely broken somewhere else. So how can you do it safely? Here are a few easy steps to follow to minimise your risks during the migration, illustrated for the case of

GoDaddy => AWS Route 53

This means we’ll assume you currently have a production app registered with GoDaddy, but you want to migrate to AWS Route 53, e.g. because you already have your servers there. We’ll also assume our domain name is “examples.com” (because with example.com I couldn’t do all the steps).

0) Dummy setup

Depending on how important it is to you that everything runs 100% smoothly, you might first want to do the entire process with a dummy domain. You’d spend $12 on any domain your heart desires, set up some DNS, and then do all the following steps and see if everything runs smoothly. This is a very time-consuming process, so I’d only recommend it if it would be the end of the world should something go wrong during the migration.

1) Setup the system on the target (AWS)

This step you can always do without impacting anything in production. Make sure you can access your system directly with its IP and that it’s running smoothly.

SSL

If you’re running with an AWS load balancer, you almost have to set up SSL with AWS Certificate Manager, since it’s free and easy. But how can you check if it’s working or not? Since you’d like to check yourproductiondomain.com, which is still with the other registrar, it’s going to be hard to check. One workaround is to temporarily point the domain at the load balancer’s current IP in your local hosts file and test against that.

HTTP VS HTTPS

Make sure that the page is accessible via http and https. On some systems you might get forwarded automatically to https, on others not. You can use http://downforeveryoneorjustme.com/ to check if a page is down on http. NOTE that the default AWS load balancer security settings don’t open port 80!!! You need to open it manually in the load balancer’s security settings.

2) Download the Zone Info from GoDaddy


There’s an option to download the zone information in GoDaddy.

3) Import the Zone Info into AWS Route 53

There’s a function to import the zone file in AWS Route 53.


However,

BE EXTREMELY CAREFUL HERE

The import somehow messes up the MX entries! The MX entries in my zone file were:

@ 3600 IN MX 5 ALT1.ASPMX.L.GOOGLE.COM
@ 3600 IN MX 5 ALT2.ASPMX.L.GOOGLE.COM
@ 3600 IN MX 10 ALT3.ASPMX.L.GOOGLE.COM
@ 3600 IN MX 10 ALT4.ASPMX.L.GOOGLE.COM

But AWS decided to import it as:

(screenshot of how the MX records look after the import in Route 53)

HOW NICE OF THEM, THEY ADD RANDOM STUFF AT THE END… Seriously guys, wtf? (What actually happens: host names without a trailing dot in a zone file are treated as relative to the zone, so your domain name gets appended to them.)

You need to correct the MX records, in case you wish to receive your mails after the migration!!!

In case you don’t use an external provider (i.e. you’re using GoDaddy email), make sure you set up the new MX records first!

Anyways, check your entries entry by entry to make sure they are set correctly.

The only one that you might want to change is the examples.com. root-level entry, because you might want to point it to an ALIAS of your AWS load balancer. The load balancer doesn’t give you a fixed IP in the first place, so this might even be a necessary switch (and perhaps the reason you migrated away from GoDaddy).

Anyways, now we’re entering the critical part.


4) Set the new Name Servers in GoDaddy

AWS Route 53 will tell you what the new name servers are: look at the NS record set of your hosted zone.


Delete the old ones from GoDaddy and insert the new ones. But before this, ask yourself again:

  1. Is my new system (if any) up and running?
  2. Are the MX Records correct?
  3. Did I set up the SSL correctly?

After changing them, the traffic will slowly start to go through the new name servers.

BUT BE AWARE: Even though it goes through the new name servers on your machine, that doesn’t mean it goes through the new name servers everywhere!!!

The only way to make sure all traffic goes through the new name servers is to wait 48 hours.

So that’s what you do: wait 48 hours. What you can do meanwhile is:

CHECK YOUR EMAILS. ARE THEY STILL WORKING?

Run for example an MX check at http://mxtoolbox.com/ to check if the e-mails via the new DNS are working.
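If you prefer the command line over a web tool, you can also query the records yourself and compare against a public resolver (replace examples.com with your domain):

dig +short NS examples.com
dig +short MX examples.com
dig +short MX examples.com @8.8.8.8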

5) Do the transfer (48h later, or with the old system still running)

To do so, first unlock your domain name at GoDaddy. They’ll provide you with an authorization code to put into AWS Route 53. Now all you have to do is request the transfer and accept it through the email they send you.

And that’s it. That’s how you migrate domains.

 

Which UI-Router version should I install?

I don’t know why, but UI-Router doesn’t seem to be very clear about their versions, maybe they are ashamed of the old ones. However, this is really confusing and might lead you to install the wrong one. Their links just redirect me in a circle without providing any real idea of what to install. It’s really quite the achievement for people building a router to build the most confusing navigation EVER. Why is this? The UI team had a successful project for Angular 1. It grew and grew, and just as Angular 1 became a mess, so did UI-Router. So they rewrote the entire core in Typescript and made it framework-agnostic, which in turn allowed them to provide implementations for Angular 1, Angular 2 and React. So today the following versions exist:

  • 0.x.x for Angular 1. Probably what you are using in an existing angular 1 project
  • RC1 for Angular 1, Angular 2 and React.

This means the confusion mainly affects Angular 1 users; for React and Angular 2 users, there’s only one project (however badly documented and still RC1).


Okay, so here’s what you really need, depending on your situation.

If you haven’t started your project yet or are early on

Then choose ui-router 1 for your project, either the Angular 2 flavor or the React flavor (since I assume you won’t use Angular 1). It’s in a late beta / RC1 (release candidate 1) state as of January 2017.

For Angular 2:

npm install --save ui-router-ng2

The links you’ll need:

For React:

npm install --save ui-router-react

Links:

If you have an existing Angular 1 app

Now this is where it gets confusing. You’ve probably built your codebase on UI-Router 0.x.x (e.g. 0.3.2) which they call UI-Router Legacy now. They’ve rewritten the entire codebase, which is UI-Router 1 now, which is just about to become “stable” in 2017.

So if you just want to update your existing UI-Router, then use:

npm install --save angular-ui-router

which installs ui-router 0.4.2 as of January 25th, 2017. Typical things you’ll find in your 0.x.x installation are things like `$stateChangeSuccess`.

The links you’ll need for working with ui-router 0.x.x:

However, it doesn’t feel good to build more code on something that is now labeled UI-Router Legacy, does it? You probably want to migrate away from it, but when is a good time?

If you’re already with Angular 1.5 or Angular 1.6 and use “component based architecture”

What’s really cool about UI-Router 1 is that you can use a component as the root of your view. So you can finally write EVERYTHING (except for services, of course) as a component! But be wary: A MIGRATION TO UI-ROUTER 1 MIGHT BE PAINFUL! They broke a lot of stuff and you should really factor in some time for a refactor!

How you can install UI-Router 1:

npm install --save angular-ui-router@next

It’s in an RC (release candidate) state, and the way forward. You wouldn’t be in too much of a hurry for the migration, unless you’re introducing a lot of new views which you’d like to strap to components instead of the legacy views. However, don’t be hasty, as it’s RC1, brings a lot of breaking changes and is not documented very well yet. The links you’ll need for the migration are those:

Bonus: Can I install UI-Router with Bower?

It’s also considered legacy (god damn), but for Angular 1 you can.

Installing “the latest legacy version” (0.x.x):

bower install angular-ui-router

Installing UI-Router 1:

bower install angular-ui-router#1

For Angular 2 and React it’s not possible.

Don’t use UI-Router’s resolve

Let’s say you have an app with certain permissions. And let’s say the permissions can usually be determined by the URL. For example, when each URL encodes a resource and you’ll get from the server which permissions the current user has for this resource. Isn’t it then a great idea to use resolve in UI-router, so you don’t have to deal with those pesky promises anymore?

I mean, here’s what you get with resolve:

  1. Get the permissions and wait app-wide until they are loaded
  2. Load the view. Then you can treat permissions synchronously like permissions.EDIT_RESOURCE
    instead of the cumbersome

    PermissionService.then(resp => {
      //do everything permission related here
    });

    which makes your app more nested, especially when you have multiple promises.

So why, when there are such great advantages, shouldn’t you use UI-Router’s resolve?

The reason is that your app will feel very laggy if you start using resolve excessively. Before any state transition, UI-Router will first resolve the promises, meaning that until you get the response from the server, your app won’t do a thing. So for example you click a button to enter a resource, but absolutely nothing happens until the resolves are resolved. This feels very laggy and broken. Trust me, I speak from experience. The app will feel so much more responsive if you handle the promises yourself. You’ll have a bit more code, but usually it’s worth it, since parts of your app can be rendered much faster and the other calls don’t have to wait for the promise to resolve.
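For completeness, here’s roughly what such a resolve looks like in a UI-Router 1 state definition, i.e. the pattern I’d advise against overusing (a sketch: the module, state, component and PermissionService are all hypothetical, and I’m assuming UI-Router 1’s $transition$ injectable):

import * as angular from 'angular';

angular.module('app').config(['$stateProvider', ($stateProvider: any) => {
  $stateProvider.state('resource', {
    url: '/resource/:resourceId',
    component: 'resourceView',
    resolve: {
      // The transition (and therefore the view) waits here until the promise resolves.
      permissions: ['PermissionService', '$transition$',
        (PermissionService: any, $transition$: any) =>
          PermissionService.getPermissions($transition$.params().resourceId)]
    }
  });
}]);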

Mathjax Parser for HTML Strings

Mathjax-Parser

Mathjax is great for rendering math on the web. You can simply choose delimiters for inline- and display-math and then start inputting some html mixed with mathjax-delimited stuff, and mathjax will render it for you. For example:

<p>Here's some math: $\frac a b = c$</p>

(assuming $…$ is your inline-math-delimiter)

This could be great, for example if you have a page where users need to input mathematics that should be rendered nicely. Ok, now let’s say you want to store this html to your database. You could either just store <p>Here's some math: $\frac a b = c$</p> OR you could map it to some kind of internal format, for example an xml scheme, like so:

<paragraph> Here's some math: <math>\frac a b = c</math></paragraph>

(you might want to do this to have a proper and flexible internal data format for your database)

Then you’d need to parse the math. In this example you’d have to detect `$…$` and map it to <math>…</math>. Now the million dollar question is: how can I parse mathjax?! (And yes, regex is a bad idea.) This article documents the thoughts behind a mathjax parser. For the final result, check out the links at the bottom for the full and working code.

Parsing Mathjax from a HTML String

I’ll assume here a basic familiarity with the DOM and the concept of DOM-nodes. If you don’t have any clue what the DOM is, it’s going to be hard for you to follow. In any case, you could still get the resulting Mathjax-Parser, just scroll down and include the full parser into your codebase. For everyone interested in a deeper understanding of the parser, read on.

Now first of all, let’s examine the rules, when mathjax actually renders something and when not. From the docs:

There cannot be HTML tags within the math delimiters (other than <br>) as TeX-formatted math does not include HTML tags

So this means, that the example from above would get rendered, while <p> $ hello <strong> world </strong>$ </p> would NOT BE RENDERED by mathjax.

So what does this rule mean for parsing mathjax? It means we need to find (here it comes):

ALL ADJACENT TEXT AND BR NODES

An example:

<p> $Hello <br> world$, isn't it a sunny <strong> day <br> today? </strong> Cool beans. </p>

Here there are three sets of adjacent text-and-br nodes. Those are the bits of the html that could contain mathjax. Now how do we find those nodes? If you’ve ever been in touch with a DOM parser, you probably know that it’s quite easy to write a walker that iterates over all nodes. A simple DOM walker would process the nodes above in the following order:

  1. <p>
  2. $Hello (text-node)
  3. <br>
  4. world$, isn’t it a sunny (text-node)
  5. <strong>
  6. day <br> today? (text-node)
  7. Cool beans. (text-node)

Now the problem is that you can easily process individual nodes, but operating on a set of nodes is a bit harder. And we’ll have to find the successions of text-or-br nodes, since that’s where the mathjax could be! Without further ado, here’s the code that does this (most of the following code is at least partially in typescript):

private findAdjacentTextOrBrNodes = (nodeList: NodeList): MyRange<number>[] => {
  //value true if node is textOrBr, false otherwise
  //example:
  // hello <br> world <span>bla</span>
  // would yield
  // [true, true, true, false]
  let textOrBrNodes: boolean[] = [];
  for (let i: number = 0; i < nodeList.length; i++) {
    let node: Node = nodeList[i];
    this.isTextOrBrNode(node) ? textOrBrNodes.push(true) : textOrBrNodes.push(false);
  }

  //get array with ranges (arrays) of adjacentTextOrBrNodes
  //example:
  // hello <br> world <span>bla</span> that's cool
  // would yield
  // [{start: 0, end: 3},{start: 4, end: 5}]
  let adjacentTextOrBrNodes: MyRange<number>[] = [];
  for (let i: number = 0; i < textOrBrNodes.length; i++) {
    let isTextOrBrNode: boolean = textOrBrNodes[i];

    if (isTextOrBrNode) {

      //handle case if IS NOT ADJACENT MATCH: insert new array
      if (adjacentTextOrBrNodes.length === 0 ||
          adjacentTextOrBrNodes[adjacentTextOrBrNodes.length - 1].end !== i
      ) {

        adjacentTextOrBrNodes.push({
          start: i,
          end: i+1
        });
      }
      //handle case if IS ADJACENT MATCH: raise value by one
      else if (adjacentTextOrBrNodes[adjacentTextOrBrNodes.length - 1].end === i) {
        ++adjacentTextOrBrNodes[adjacentTextOrBrNodes.length - 1].end;
      }

    }
  }
  return adjacentTextOrBrNodes;
};
interface MyRange<T> {
  start: T;
  end: T;
}

Here are the cornerstones explained:

  • findAdjacentTextOrBrNodes: our elaborate method name.
  • MyRange<number>[]: return type of the method. For the above example, this would be the expected outcome when running this method on the p node:
[{start: 0, end: 3}, {start: 4, end: 5}]
  • nodeList: NodeList: A node list that must INCLUDE TEXT NODES. More on that later.
  • The rest is basically just an iteration over all children producing the output described above. Again, the idea is to get a list of all adjacent text-or-br nodes.

Once we have this method, we can build on it. We can write a processor, again for a nodeList, that fetches the adjacent text-or-br nodes, processes the mathjax, and then recursively does the same for all child nodes.

private processNodeList = (nodeList: NodeList) => {
  let allAdjacentTextOrBrNodes: MyRange<number>[] = this.findAdjacentTextOrBrNodes(nodeList);

  allAdjacentTextOrBrNodes.forEach((textOrBrNodeSet: MyRange<number>) => {
    //processMath
  });

  //process children
  for (let i: number = 0; i < nodeList.length; i++) {
    let node: Node = nodeList[i];
    this.processNodeList(node.childNodes);
  }

};

How exactly the math is processed would largely be a repetition of the implementation details of the parser and is thus omitted here. However, one thing to point out: I called node.childNodes and not node.children, since the latter does not contain text nodes! And those are very key to our endeavour to parse mathjax. Also, for the initial bootstrapping of processNodeList, childNodes is called like so:

//create a DOM element in order to use the DOM-Walker
let body: HTMLElement = document.createElement('body');
body.innerHTML = inputHtml;

this.processNodeList(body.childNodes);

The implementation details of processMath are a bit complicated; the easiest thing at this point would be to just get the code (see below) and check it out. Here we’ll just discuss a few more points that were important for the building process.

“processEscapes”: true

When using the $ delimiters, mathjax provides a processEscapes option, such that \$ isn’t counted as the start of mathjax. In any case, \$ is never counted as the end of mathjax in TeX mode, processEscapes enabled or not. The mathjax parser has to handle this special case.
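For reference, in a MathJax 2.x setup those options live in the tex2jax block of the configuration, roughly like this (a sketch, assuming MathJax is loaded globally):

MathJax.Hub.Config({
  tex2jax: {
    inlineMath: [['$', '$'], ['\\(', '\\)']],
    displayMath: [['$$', '$$'], ['\\[', '\\]']],
    processEscapes: true  // \$ is treated as a literal dollar sign, not a delimiter
  }
});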

Mathjax Parsing Rules

Understanding how exactly mathjax processes text-or-br nodes is crucial to writing the parser correctly. With the following settings

"inlineMath": [["\\(","\\)"]],
"displayMath": [["\\[","\\]"]]

let’s illustrate this with an example.

Example: This \[hello \(world\).

Rendered: This \[hello \(world\)

As you can see it’s not rendered at all. So what you’ll have to do in the parser is scan all text-or-br nodes “left to right“, and as soon as you find an opening delimiter, ignore everything else and try to find the closing delimiter.

This also means that in the following example, the math-inside-the-math will not be rendered:

Example: Hello \[ Wo \(rld\) \].

Rendered: Hello \[ Wo \(rld\) \].

Ordering the delimiters

For example, if you have the start-delimiters $$ for display-math, and $ for inline-math, you’ll have to scan for $$ first such that the parsing works correctly. More generally the rule goes like this for all delimiters:

If start-delimiter A contains start-delimiter B, then you must first scan for A and only afterwards for B.

Final Result

Here are the final results:

Github: Mathjax-Parser

 

Directly install with NPM

npm install --save mathjax-parser

And a pic of what the parser could do, here transforming the delimiters of the inline math and block math.