feat: add remarklint for md docs (#213)

RalphJbeily · web-flow · commit 2e1e4d90c9de · 2019-01-24T11:09:18.000+02:00
* feat: add remarklint for md docs

* fix: remarkrc file and run linter on commit hook
diff --git a/.remarkrc b/.remarkrc
@@ -0,0 +1,6 @@
+{
+  "plugins": [
+    "remark-preset-lint-recommended",
+    ["remark-lint-list-item-indent", false]
+  ]
+}
diff --git a/package.json b/package.json
@@ -6,7 +6,7 @@
   "main": "./dist/mercury.js",
   "scripts": {
     "lint": "if test \"$CI\" != \"true\" ; then eslint . --fix; fi",
-    "lint:ci": "eslint .",
+    "lint:ci": "remark . && eslint .",
     "lint-fix-quiet": "eslint --fix --quiet",
     "build": "yarn lint && rollup -c && yarn test:build",
     "build:web": "yarn lint && rollup -c rollup.config.web.js && yarn test:build:web",
@@ -79,6 +79,9 @@
     "nock": "^10.0.6",
     "ora": "^3.0.0",
     "prettier": "^1.15.3",
+    "remark-cli": "^6.0.1",
+    "remark-lint": "^6.0.4",
+    "remark-preset-lint-recommended": "^3.0.2",
     "requirejs": "^2.3.6",
     "rollup": "^1.1.0",
     "rollup-plugin-babel": "^4.0.1",
@@ -125,6 +128,7 @@
       "git add"
     ],
     "*.{json,css,md}": [
+      "remark .",
       "prettier --write",
       "git add"
     ]
diff --git a/src/extractors/custom/README.md b/src/extractors/custom/README.md
@@ -8,18 +8,19 @@ Custom parsers allow you to write CSS selectors that will find the content you'r
 
 You can query for every field returned by the Mercury Parser:
 
-  - title
-  - author
-  - content
-  - date_published
-  - lead_image_url
-  - dek
-  - next_page_url
-  - excerpt
+- title
+- author
+- content
+- date_published
+- lead_image_url
+- dek
+- next_page_url
+- excerpt
 
 ### Using selectors
 
 #### Basic selectors
+
 To demonstrate, let's start with something simple: Your selector for the page's title might look something like this:
 
 ```javascript
@@ -41,12 +42,13 @@ As you might guess, the selectors key provides an array of selectors that Mercur
 The selector you choose should return one element. If more than one element is returned by your selector, it will fail (and Mercury will fall back to its generic extractor).
 
 #### Selecting an attribute
-Sometimes the information you want to return lives in an element's attribute rather than its text — e.g., sometimes a more exact ISO-formatted date/time will be stored in an attribute of an element. 
+
+Sometimes the information you want to return lives in an element's attribute rather than its text — e.g., sometimes a more exact ISO-formatted date/time will be stored in an attribute of an element.
 
 So your element looks like this:
 
 ```html
-    <time class="article-timestamp" datetime="2016-09-02T07:30:01-04:00">
+<time class="article-timestamp" datetime="2016-09-02T07:30:01-04:00"></time>
 ```
 
 The text you want isn't the text inside a matching element, but rather, inside the datetime attribute. To write a selector that returns an attribute, you provide your custom parser with a two-element array. The first element is your selector; the second element is the attribute you'd like to return.
@@ -71,7 +73,7 @@ This is all you'll need to know to handle most of the fields Mercury parses (tit
 
 An article's content can be more complex than the other fields, meaning you sometimes need to do more than just provide the selector(s) in order to return clean content.
 
-For example, sometimes an article's content will contain related content that doesn't translate or render well when you just want to see the article's content. The clean key allows you to provide an array of selectors identifying elements that should be removed from the content. 
+For example, sometimes an article's content will contain related content that doesn't translate or render well when you just want to see the article's content. The clean key allows you to provide an array of selectors identifying elements that should be removed from the content.
 
 Here's an example:
 
@@ -195,21 +197,21 @@ If you look at your parser's test file, you'll see a few instructions to guide y
 By default, the first test, which ensures your custom extractor is being selected properly, should be passing. The first failing test checks to see whether your extractor returns the correct title:
 
 ```javascript
-  it('returns the title', (async) () => {
-    // To pass this test, fill out the title selector
-    // in ./src/extractors/custom/www.newyorker.com/index.js.
-    const html =
-      fs.readFileSync('./fixtures/www.newyorker.com/1475245895852.html');
-    const articleUrl =
-      'http://www.newyorker.com/tech/elements/hacking-cryptography-and-the-countdown-to-quantum-computing';
-
-    const { title } =
-      await Mercury.parse(articleUrl, html, { fallback: false });
-
-    // Update these values with the expected values from
-    // the article.
-    assert.equal(title, 'Schrödinger’s Hack');
-  });
+it('returns the title', async () => {
+  // To pass this test, fill out the title selector
+  // in ./src/extractors/custom/www.newyorker.com/index.js.
+  const html = fs.readFileSync(
+    './fixtures/www.newyorker.com/1475245895852.html'
+  );
+  const articleUrl =
+    'http://www.newyorker.com/tech/elements/hacking-cryptography-and-the-countdown-to-quantum-computing';
+
+  const { title } = await Mercury.parse(articleUrl, html, { fallback: false });
+
+  // Update these values with the expected values from
+  // the article.
+  assert.equal(title, 'Schrödinger’s Hack');
+});
 ```
 
 As you can see, to pass this test, we need to fill out our title selector, which means first working out what that selector is. To find it, open the HTML fixture the generator downloaded for you in the [`fixtures`](/fixtures) directory. In our example, that file is `fixtures/www.newyorker.com/1475248565793.html`. Now open that file in your web browser.
@@ -223,7 +225,7 @@ So, back to the title: We want to make sure our test finds the same title we see
 The selector for this title appears to be `h1.title`. To verify that we're right, click on the Console tab in Chrome's Developer Tools and run the following check:
 
 ```javascript
-$$('h1.title')
+$$('h1.title');
 ```
 
 If that returns only one match (i.e., an array with just one element), and the text of that element looks like the title we want, you're good to go!
@@ -247,7 +249,8 @@ export const NewYorkerExtractor = {
 Save the file, and... uh oh, our example still fails.
 
 ```javascript
-AssertionError: 'Hacking, Cryptography, and the Countdown to Quantum Computing' == 'Schrödinger’s Hack'
+AssertionError: 'Hacking, Cryptography, and the Countdown to Quantum Computing' ==
+  'Schrödinger’s Hack';
 ```
 
 When Mercury generated our test, it took a guess at the page's title, and in this case, it got it wrong. So update the test with the title we expect, save it, and your test should pass!
@@ -259,7 +262,7 @@ We've been moving at a slow pace, but as you can see, once you understand the ba
 For a slightly more complex example, you'll find after a bit of looking that the best place to get the most accurate datetime on the page is in the head of the document, in the value attribute of a meta tag:
 
 ```html
-<meta value="2016-09-26T14:04:22-04:00" name="article:published_time">
+<meta value="2016-09-26T14:04:22-04:00" name="article:published_time" />
 ```
 
 As [explained above](#selecting-an-attribute), to return an attribute rather than the text inside an element, your selector should be an array where the first element is the element selector and the second element is the attribute you want to return. So, in this example, the date_published selector should look like this:
diff --git a/yarn.lock b/yarn.lock