CLI: sync report logic \w Linguist #214

Merged
merged 9 commits into from
Apr 9, 2019
Changes from 1 commit
13 changes: 11 additions & 2 deletions README.md
@@ -188,9 +188,18 @@ as a set for the tests, the following issues were found:

* [Heuristics for ".es" extension](/~https://github.com/github/linguist/blob/e761f9b013e5b61161481fcb898b59721ee40e3d/lib/linguist/heuristics.yml#L103) in JavaScript could not be parsed, because the RE2 regexp engine does not support backreferences

-* As of [Linguist v5.3.2](/~https://github.com/github/linguist/releases/tag/v5.3.2) it is using [flex-based scanner in C for tokenization](/~https://github.com/github/linguist/pull/3846). Enry still uses [extract_token](/~https://github.com/github/linguist/pull/3846/files#diff-d5179df0b71620e3fac4535cd1368d15L60) regex-based algorithm. Tracked under /~https://github.com/src-d/enry/issues/193
+* As of [Linguist v5.3.2](/~https://github.com/github/linguist/releases/tag/v5.3.2) it is using [flex-based scanner in C for tokenization](/~https://github.com/github/linguist/pull/3846). Enry still uses [extract_token](/~https://github.com/github/linguist/pull/3846/files#diff-d5179df0b71620e3fac4535cd1368d15L60) regex-based algorithm. [#193](/~https://github.com/src-d/enry/issues/193)

-* Bayesian classifier can't distinguish "SQL" vs "PLpgSQL". Tracked under /~https://github.com/src-d/enry/issues/194
+* Bayesian classifier can't distinguish "SQL" vs "PLpgSQL". [#194](/~https://github.com/src-d/enry/issues/194)

+* Detection of [generated files](/~https://github.com/github/linguist/blob/bf95666fc15e49d556f2def4d0a85338423c25f3/lib/linguist/generated.rb#L53) is not supported yet,
+  so they are not excluded from CLI output. [#213](/~https://github.com/src-d/enry/issues/213)

+* XML detection strategy is not implemented. [#192](/~https://github.com/src-d/enry/issues/192)

+* Overriding languages and types through `.gitattributes` is not yet supported. [#18](/~https://github.com/src-d/enry/issues/18)

+* `enry` CLI output does NOT exclude `.gitignore`d files and submodule dirs, as Linguist does

`enry` [CLI tool](#cli) does not require a full Git repository to be present in filesystem in order to report languages.
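
The RE2 limitation in the first item of the list above can be reproduced with Go's standard `regexp` package, which follows RE2 syntax and rejects backreferences at compile time. The sketch below is only an illustration: the `compiles` helper and both patterns are hypothetical and are not taken from `heuristics.yml` or from enry's code.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's RE2-style regexp package accepts the pattern.
// Hypothetical helper for illustration; it is not part of enry.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Illustrative patterns only, not the actual ".es" heuristic.
	withBackref := `(\w+) \1`     // refers back to capture group 1
	withoutBackref := `(\w+) \w+` // same shape, no backreference

	fmt.Println("backreference compiles:", compiles(withBackref))
	fmt.Println("plain pattern compiles:", compiles(withoutBackref))
}
```

A heuristic that relies on a backreference therefore has to be rewritten or skipped when it is ported to an RE2-based engine.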

5 changes: 4 additions & 1 deletion cmd/enry/main.go
@@ -85,6 +85,7 @@ func main() {

 		if enry.IsVendor(relativePath) || enry.IsDotFile(relativePath) ||
 			enry.IsDocumentation(relativePath) || enry.IsConfiguration(relativePath) {
+			//TODO(bzz): skip enry.IsGeneratedPath() after /~https://github.com/src-d/enry/issues/213
 			if f.IsDir() {
 				return filepath.SkipDir
 			}
@@ -105,13 +106,15 @@ func main() {
 			log.Println(err)
 			return nil
 		}
+		//TODO(bzz): skip enry.IsGeneratedContent() after /~https://github.com/src-d/enry/issues/213

 		language := enry.GetLanguage(filepath.Base(path), content)
 		if language == enry.OtherLanguage {
 			return nil
 		}

-		// If we are displaying only prog, skip it
+		// If we are not asked to display all, do as
+		// /~https://github.com/github/linguist/blob/bf95666fc15e49d556f2def4d0a85338423c25f3/lib/linguist/blob_helper.rb#L382
 		if !*allLangs &&
 			enry.GetLanguageType(language) != enry.Programming &&
 			enry.GetLanguageType(language) != enry.Markup {
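
The condition at the end of this hunk skips a file unless its language type is programming or markup, or all languages were requested. A minimal standalone sketch of that rule follows; the `langType` enum and the `keep` helper are hypothetical stand-ins, used so the example runs without importing enry, and are not enry's actual API.

```go
package main

import "fmt"

// langType is a hypothetical stand-in for the language type values
// that the diff compares against (enry.Programming, enry.Markup).
type langType int

const (
	data langType = iota
	programming
	markup
	prose
)

// keep reports whether a file of type t stays in the report.
// It is the negation of the skip condition in the diff:
// skip when !allLangs && t != programming && t != markup.
func keep(t langType, allLangs bool) bool {
	return allLangs || t == programming || t == markup
}

func main() {
	names := map[langType]string{
		data: "data", programming: "programming", markup: "markup", prose: "prose",
	}
	for _, t := range []langType{data, programming, markup, prose} {
		fmt.Printf("%-12s default=%v all=%v\n", names[t], keep(t, false), keep(t, true))
	}
}
```

Writing the rule as a positive `keep` predicate makes it easier to see that the all-languages flag only widens the report and never removes programming or markup files.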