Lint frontmatter (#5218)

Boris Verkhovskiy 2024-12-18 09:38:58 -07:00 committed by GitHub
parent 4d7ecfbbf7
commit 770a4138b4
GPG Key ID: B5690EEEBB952194
18 changed files with 180 additions and 80 deletions

View File

@ -3,4 +3,3 @@
- [ ] Pull request touches only one file (or a set of logically related files with similar changes made)
- [ ] Content changes are aimed at *intermediate to experienced programmers* (this is a poor format for explaining fundamental programming concepts)
- [ ] If you've changed any part of the YAML Frontmatter, make sure it is formatted according to [CONTRIBUTING.md](https://github.com/adambard/learnxinyminutes-docs/blob/master/CONTRIBUTING.md)
- [ ] Yes, I have double-checked quotes and field names!

View File

@ -11,8 +11,18 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.13'
- run: pip install -r lint/requirements.txt
- uses: ruby/setup-ruby@v1
with:
ruby-version: '3.2'
- run: gem install mdl
- run: mdl . --ignore-front-matter -r MD003,MD011,MD023,MD027,MD028,MD035,MD037,MD038,MD039,MD047
- name: Files are UTF-8
run: ./lint/encoding.sh .
- name: Lint Markdown
run: mdl . --ignore-front-matter -r MD003,MD011,MD023,MD027,MD028,MD035,MD037,MD038,MD039,MD047
- name: Lint frontmatter
run: ./lint/frontmatter.py .

View File

@ -31,13 +31,6 @@ review them more effectively and/or individually.
language in question.
* Keep articles succinct and scannable. We all know how to use Google here.
* **Use UTF-8**
* For translations (or EN articles with non-ASCII characters) please ensure
your file is UTF-8 encoded.
* Leave out the byte-order-mark (BOM) at the start of the file (in Vim, use
`:set nobomb`).
* You can check if the file contains a BOM on Linux/Unix systems by running
`file language.html.markdown` You will see this if it uses a BOM:
`UTF-8 Unicode (with BOM) text`.
### Header configuration
@ -48,29 +41,32 @@ called frontmatter.
The following fields are necessary for English articles about programming
languages:
* **name** The human-readable name of the programming language
* **contributors** A list of [author, URL] lists to credit
* `name`: The human-readable name of the programming language
* `contributors`: A list of [*author*, *URL*] lists to credit, *URL* is optional
Other fields:
* **category**: The category of the article. So far, can be one of *language*,
* `category`: The category of the article. So far, can be one of *language*,
*tool* or *Algorithms & Data Structures*. Defaults to *language* if omitted.
* **filename**: The filename for this article's code. It will be fetched, mashed
* `filename`: The filename for this article's code. It will be fetched, mashed
together, and made downloadable.
Translations should also include:
* `translators`: A list of [*translator*, *URL*] lists to credit, *URL* is optional
Non-English articles inherit frontmatter values from the English article (if it exists)
but you can overwrite them.
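As a rough illustration of the inheritance rule described above, the sketch below merges a translation's frontmatter over the English article's. The `merge_frontmatter` helper and the sample values (including the English `name` and `filename`) are hypothetical; the site's actual build code is not part of this commit and may work differently.

```python
def merge_frontmatter(english, translation):
    """Return the translation's frontmatter, falling back to English values.

    Hypothetical helper: keys set in the translation win, anything missing
    is inherited from the English article (if one exists).
    """
    merged = dict(english or {})  # start from the English values, if any
    merged.update(translation)    # translation values overwrite them
    return merged


# Illustrative values only, not taken from the real articles.
english = {
    "name": "Elixir",
    "filename": "learnelixir.ex",
    "contributors": [["Joao Marques", "http://github.com/mrshankly"]],
}
spanish = {
    "filename": "learnelixir-es.ex",
    "translators": [["Adrian Carrascal", "https://github.com/acarrascalgarcia"]],
}

merged = merge_frontmatter(english, spanish)
print(merged["name"])      # inherited from the English article
print(merged["filename"])  # overwritten by the translation
```

Copying the English dictionary before updating it keeps the original untouched, so the same English frontmatter can be reused for every translation.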
Here's an example header for Ruby:
```yaml
*--
---
name: Ruby
filename: learnruby.rb
contributors:
- ["Doktor Esperanto", "http://example.com/"]
- ["Someone else", "http://someoneelseswebsite.com/"]
*--
---
```
### Syntax highlighter

View File

@ -2,7 +2,7 @@
filename: learngroovy-ca.groovy
contributors:
- ["Roberto Pérez Alcolea", "http://github.com/rpalcolea"]
translations:
translators:
- ["Xavier Sala Pujolar", "http://github.com/utrescu"]
---

View File

@ -3,10 +3,9 @@ contributors:
- ["Joao Marques", "http://github.com/mrshankly"]
- ["Dzianis Dashkevich", "https://github.com/dskecse"]
- ["Ryan Plant", "https://github.com/ryanplant-au"]
translator:
translators:
- ["Adrian Carrascal", "https://github.com/acarrascalgarcia"]
filename: learnelixir-es.ex
---
Elixir es un lenguaje funcional moderno construido sobre la máquina virtual de Erlang.

View File

@ -2,9 +2,8 @@
filename: LearnGit-es.txt
contributors:
- ["Jake Prather", "http://github.com/JakeHP"]
translator:
translators:
- ["Raúl Ascencio", "http://rscnt.github.io"]
---
Git es un sistema de control de versiones distribuido diseñado para manejar

View File

@ -4,10 +4,11 @@ contributors:
- ["Dzianis Dashkevich", "https://github.com/dskecse"]
- ["Ryan Plant", "https://github.com/ryanplant-au"]
- ["Ev Bogdanov", "https://github.com/evbogdanov"]
translator:
translators:
- ["Timothé Pardieu", "https://github.com/timprd"]
filename: learnelixir-fr.ex
---
Elixir est un langage de programmation fonctionnel moderne reposant sur la machine virtuelle BEAM, qui héberge aussi Erlang.
Il est totalement compatible avec Erlang mais dispose d'une syntaxe plus agréable et apporte de nouvelles fonctionnalités.

View File

@ -1,6 +1,5 @@
---
name: Groovy
filename: learngroovy.groovy
contributors:
- ["Roberto Pérez Alcolea", "http://github.com/rpalcolea"]
filename: learngroovy.groovy

View File

@ -1,7 +1,7 @@
---
contributors:
- ["Brett Taylor", "https://github.com/glutnix"]
translator:
translators:
- ["Agostino Fiscale", "https://github.com/agostinofiscale"]
filename: LearnComposer-it.sh
---

30
lint/encoding.sh Executable file
View File

@ -0,0 +1,30 @@
#!/bin/bash

check_encoding() {
    file="$1"
    encoding=$(file -b --mime-encoding "$file")

    # Check if the encoding is neither UTF-8 nor US-ASCII
    if [[ "$encoding" != "utf-8" && "$encoding" != "us-ascii" ]]; then
        # Print the file path and encoding
        echo "Error: $file has encoding $encoding, which is not utf-8 or us-ascii"
        return 1
    fi

    # Check for UTF-8 BOM
    if [[ "$encoding" == "utf-8" ]]; then
        if head -c 3 "$file" | cmp -s <(echo -ne '\xEF\xBB\xBF'); then
            echo "Error: $file contains a UTF-8 BOM"
            return 1
        fi
    fi

    return 0
}

export -f check_encoding

# Default to current directory if no argument is given
directory="${1:-.}"

find "$directory" -type f -name "*.md" -print0 | xargs -0 -P 8 -I {} bash -c 'check_encoding "$@"' _ {}
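For readers who want to try the same two checks locally without `file`, here is a minimal Python sketch. It is not part of this commit. The helper names `is_utf8` and `has_utf8_bom` are made up for illustration, and the decode-based test is an assumption that will not always agree with the heuristics behind `file --mime-encoding`.

```python
import codecs
import os
import tempfile


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 byte-order mark."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def is_utf8(path):
    """Return True if the file's bytes decode as UTF-8 (ASCII is a subset)."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Demonstrate on three temporary files: clean UTF-8, UTF-8 with BOM, Latin-1.
samples = {
    "clean": "Pérez\n".encode("utf-8"),
    "bom": codecs.BOM_UTF8 + "Pérez\n".encode("utf-8"),
    "latin1": "Pérez\n".encode("latin-1"),
}
results = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, payload in samples.items():
        path = os.path.join(tmp, name + ".md")
        with open(path, "wb") as f:
            f.write(payload)
        # Store (valid UTF-8?, starts with BOM?) for each sample file.
        results[name] = (is_utf8(path), has_utf8_bom(path))

print(results)
```

A file with a BOM still decodes as valid UTF-8, which is why the BOM needs its own check, just as in the shell script.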

120
lint/frontmatter.py Executable file
View File

@ -0,0 +1,120 @@
#!/usr/bin/env python3

import re
from pathlib import Path
import yaml
import yamllint.config
import yamllint.linter


def extract_yaml_frontmatter(file_path):
    """Extracts YAML front matter from a Markdown file."""
    with open(file_path, "r", encoding="utf-8") as file:
        content = file.read()
    matches = re.match(r"^(---\s*\n.*?\n)---\n", content, re.DOTALL)
    if matches:
        return matches.group(1)
    return None


yaml_config = yamllint.config.YamlLintConfig(
    """{
        extends: relaxed,
        rules: {
            commas: disable,
            trailing-spaces: disable,
            indentation: disable,
            line-length: disable,
            empty-lines: disable
        }
    }"""
)


def lint_yaml(yaml_content):
    """Lints YAML content using yamllint by sending it to stdin."""
    problems = []
    for p in yamllint.linter.run(yaml_content, yaml_config):
        problems.append(f"{p.line}:{p.column} {p.desc} ({p.rule})")
    return "\n".join(problems)


def validate_yaml_keys(yaml_content, allowed_keys):
    """Validates that the YAML content contains only the specified keys."""
    try:
        data = yaml.safe_load(yaml_content)
        if not data:
            return "Empty YAML front matter."
        extra_keys = set(data.keys()) - set(allowed_keys)
        if extra_keys:
            return f"Invalid keys found: {', '.join(extra_keys)}"
        for key, value_type in allowed_keys.items():
            if key in data:
                if not isinstance(data[key], value_type):
                    return f"Invalid type for key '{key}': expected {value_type.__name__}, got {type(data[key]).__name__}"
                if isinstance(data[key], list):
                    for item in data[key]:
                        if not isinstance(item, list):
                            return f"Invalid type for item in key '{key}': expected list, got {type(item).__name__}"
                        elif not item:
                            return f"Invalid item in key '{key}': found empty list"
                        elif not isinstance(item[0], str):
                            return f"Invalid type for item[0] in key '{key}': expected str, got {type(item[0]).__name__}"
                        elif len(item) == 2 and not isinstance(item[1], str):
                            return f"Invalid type for item[1] in key '{key}': expected str, got {type(item[1]).__name__}"
                        elif len(item) > 2:
                            return f"Invalid length for item in key '{key}': expected 1 or 2, got {len(item)}"
    except yaml.YAMLError as e:
        return f"Error parsing YAML: {e}"
    return ""


def process_files(path):
    """Processes either a single file or all Markdown files in a directory."""
    if path.is_dir():
        pathlist = path.rglob("*.md")
    else:
        pathlist = [path]

    has_error = False
    allowed_keys = {
        "name": str,
        "where_x_eq_name": str,
        "category": str,
        "filename": str,
        "contributors": list,
        "translators": list,
    }
    for path in pathlist:
        yaml_content = extract_yaml_frontmatter(path)
        if yaml_content:
            lint_result = lint_yaml(yaml_content)
            key_validation = validate_yaml_keys(yaml_content, allowed_keys)
            if lint_result or key_validation:
                if has_error:  # don't prepend newline to first error
                    print()
                print(path)
                if lint_result:
                    print(lint_result)
                if key_validation:
                    print(key_validation)
                has_error = True
    return has_error


def main(path_input):
    """Determines if the input is a directory or a file and processes accordingly."""
    path = Path(path_input)
    if not path.exists():
        print(f"Error: {path_input} does not exist.")
        return 1
    return process_files(path)


if __name__ == "__main__":
    import sys

    path_input = sys.argv[1] if len(sys.argv) > 1 else "."
    has_error = main(path_input)
    sys.exit(1 if has_error else 0)
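To see what the extraction regex in `extract_yaml_frontmatter` captures, the standalone sketch below applies the same pattern to in-memory strings instead of files. The `extract` wrapper and the sample strings are illustrative assumptions, not code from this commit.

```python
import re

# Same pattern as in lint/frontmatter.py, compiled once for reuse.
FRONTMATTER_RE = re.compile(r"^(---\s*\n.*?\n)---\n", re.DOTALL)


def extract(content):
    """Return the front matter (including the opening '---' line) or None."""
    match = FRONTMATTER_RE.match(content)
    return match.group(1) if match else None


with_header = (
    "---\n"
    "name: Ruby\n"
    "filename: learnruby.rb\n"
    "---\n"
    "\n"
    "Article body\n"
)

print(repr(extract(with_header)))          # opening '---' plus the key lines
print(extract("No front matter here\n"))   # None: no leading '---'
print(extract("---\nname: Ruby\n---"))     # None: closing '---' lacks a newline
```

The captured text keeps the opening `---`, which YAML treats as a document start marker, so it can be passed to a YAML parser as-is. A file whose closing `---` is not followed by a newline does not match, so the linter would skip it silently.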

2
lint/requirements.txt Normal file
View File

@ -0,0 +1,2 @@
yamllint
pyyaml

View File

@ -3,7 +3,7 @@ contributors:
- ["Joao Marques", "http://github.com/mrshankly"]
- ["Dzianis Dashkevich", "https://github.com/dskecse"]
- ["Ryan Plant", "https://github.com/ryanplant-au"]
translator:
translators:
- ["Ev Bogdanov", "https://github.com/evbogdanov"]
filename: learnelixir-ru.ex
---

View File

@ -1,32 +0,0 @@
#!/usr/bin/env ruby
require 'charlock_holmes'
$file_count = 0;
markdown_files = Dir["./**/*.html.markdown"]
markdown_files.each do |file|
  begin
    contents = File.read(file)
    detection = CharlockHolmes::EncodingDetector.detect(contents)
    case detection[:encoding]
    when 'UTF-8'
      $file_count = $file_count + 1
    when 'ISO-8859-1'
      $file_count = $file_count + 1
    when /ISO-8859/
      puts "Notice: #{file} was detected as #{detection[:encoding]} encoding. Everything is probably fine."
      $file_count = $file_count + 1
    else
      puts "WARNING #{file} was detected as #{detection[:encoding]} encoding. Please save the file in UTF-8!"
    end
  rescue Exception => msg
    puts msg
  end
end
files_failed = markdown_files.length - $file_count
if files_failed != 0
  puts "FAILURE!!! #{files_failed} files were unable to be validated as UTF-8!"
  puts "Please resave the file as UTF-8."
  exit 1
else
  puts "Success. All #{$file_count} files passed UTF-8 validity checks."
  exit 0
end

View File

@ -1,21 +0,0 @@
#!/usr/bin/env ruby
require 'yaml';
$file_count = 0;
markdown_files = Dir["./**/*.html.markdown"]
markdown_files.each do |file|
  begin
    YAML.load_file(file)
    $file_count = $file_count + 1
  rescue Exception => msg
    puts msg
  end
end
files_failed = markdown_files.length - $file_count
if files_failed != 0
  puts "FAILURE!!! #{files_failed} files were unable to be parsed!"
  puts "Please check the YAML headers for the documents that failed!"
  exit 1
else
  puts "All #{$file_count} files were verified valid YAML"
  exit 0
end

View File

@ -8,7 +8,6 @@ contributors:
- ["Jason Stathopulos", "http://github.com/SpiritBreaker226"]
- ["Milo Gilad", "http://github.com/Myl0g"]
- ["Adem Budak", "https://github.com/p1v0t"]
filename: LearnGit-tr.txt
---

View File

@ -4,7 +4,6 @@ contributors:
- ["Aleksey Kholovchuk", "https://github.com/vortexxx192"]
translators:
- ["GengchenXU", "https://github.com/GengchenXU"]
---
**Qt** Qt是一个广为人知的框架用于开发跨平台软件该软件可以在各种软件和硬件平台上运行代码几乎没有变化同时具有本机应用程序的能力和速度。虽然**Qt**最初是用*C*++,但也有其他语言的端口: *[PyQt](../pyqt/)*, *QtRuby*, *PHP-Qt*, 等等.