Implement the Terraform Lexer #917

Merged
merged 3 commits into from
Jun 19, 2018
7 changes: 7 additions & 0 deletions lib/rouge/demos/hcl
@@ -0,0 +1,7 @@
service {
  key = "value"
}

variable "ami" {
  description = "the AMI to use"
}
31 changes: 31 additions & 0 deletions lib/rouge/demos/terraform
@@ -0,0 +1,31 @@
# From: https://github.com/terraform-providers/terraform-provider-aws/blob/master/examples/count/main.tf

# Specify the provider and access details
provider "aws" {
  region = "${var.aws_region}"
}

resource "aws_elb" "web" {
  name = "terraform-example-elb"

  # The same availability zone as our instances
  availability_zones = ["${aws_instance.web.*.availability_zone}"]

  listener {
    instance_port = 80
    instance_protocol = "http"
    lb_port = 80
    lb_protocol = "http"
  }

  # The instances are registered automatically
  instances = ["${aws_instance.web.*.id}"]
}

resource "aws_instance" "web" {
  instance_type = "m1.small"
  ami = "${lookup(var.aws_amis, var.aws_region)}"

  # This will create 4 instances
  count = 4
}
162 changes: 162 additions & 0 deletions lib/rouge/lexers/hcl.rb
@@ -0,0 +1,162 @@
# -*- coding: utf-8 -*- #

module Rouge
  module Lexers
    class Hcl < RegexLexer
      tag 'hcl'

      title 'Hashicorp Configuration Language'
      desc 'Hashicorp Configuration Language, used by Terraform and other Hashicorp tools'

      state :multiline_comment do
        rule %r([*]/), Comment::Multiline, :pop!
        rule %r([^*/]+), Comment::Multiline
        rule %r([*/]), Comment::Multiline
      end

      state :comments_and_whitespace do
        rule /\s+/, Text
        rule %r(//.*?$), Comment::Single
        rule %r(#.*?$), Comment::Single
        rule %r(/[*]), Comment::Multiline, :multiline_comment
      end

      state :primitives do
        rule /[0-9][0-9]*\.[0-9]+([eE][0-9]+)?[fd]?([kKmMgG]b?)?/, Num::Float
        rule /[0-9]+([kKmMgG]b?)?/, Num::Integer

        rule /"/, Str::Double, :dq
        rule /'/, Str::Single, :sq
        rule /(<<-?)(\s*)(\'?)(\\?)(\w+)(\3)/ do |m|
          groups Operator, Text, Str::Heredoc, Str::Heredoc, Name::Constant, Str::Heredoc
          @heredocstr = Regexp.escape(m[5])
          push :heredoc
        end
      end
Collaborator

Any reason for putting these above the def self.* methods and also above the root state? While it's a minor style thing, it's customary to have root be the first state and methods to appear at the top before states. Maybe there's a specific thought you had in mind, though?

Contributor Author
@lowjoel, Jun 13, 2018

Not quite. I took the ordering from the JavaScript lexer: https://github.com/jneen/rouge/blob/master/lib/rouge/lexers/javascript.rb. It made sense because these keyword sets may be overridden in derived lexers (Terraform redefines them).

What do you recommend?

Collaborator

I'm OK with this.
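
For readers following the thread, here is a minimal sketch of the override pattern being discussed. It assumes nothing about Rouge itself: `BaseLexer`, `DerivedLexer`, and `classify` are hypothetical stand-ins for `Hcl`, `Terraform`, and the `rule id` block. It shows that a memoized class-level set is stored per class, so a subclass that redefines `self.keywords` gets its own set while the shared lookup code stays in the base class.

```ruby
require 'set'

# Hypothetical stand-in for the base lexer (not Rouge's actual class).
class BaseLexer
  # Memoized in a class-level instance variable, so each class keeps its own set.
  def self.keywords
    @keywords ||= Set.new
  end

  # Shared lookup: consults whichever keyword set the receiver's class defines.
  def classify(word)
    self.class.keywords.include?(word) ? :keyword : :name
  end
end

# Hypothetical stand-in for a derived lexer that redefines the keyword set.
class DerivedLexer < BaseLexer
  def self.keywords
    @keywords ||= Set.new(%w(resource variable))
  end
end

p BaseLexer.new.classify('resource')    # => :name
p DerivedLexer.new.classify('resource') # => :keyword
```

Because the lookup goes through `self.class`, the base class never needs to know which sets a derived lexer defines.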


      def self.keywords
        @keywords ||= Set.new %w()
      end

      def self.declarations
        @declarations ||= Set.new %w()
      end

      def self.reserved
        @reserved ||= Set.new %w()
      end

      def self.constants
        @constants ||= Set.new %w(true false null)
      end

      def self.builtins
        @builtins ||= %w()
      end

      id = /[$a-z_][a-z0-9_]*/io

      state :root do
        mixin :comments_and_whitespace
        mixin :primitives

        rule /\{/ do
          token Punctuation
          push :hash
        end
        rule /\[/ do
          token Punctuation
          push :array
        end

        rule id do |m|
          if self.class.keywords.include? m[0]
            token Keyword
            push :composite
          elsif self.class.declarations.include? m[0]
            token Keyword::Declaration
            push :composite
          elsif self.class.reserved.include? m[0]
            token Keyword::Reserved
          elsif self.class.constants.include? m[0]
            token Keyword::Constant
          elsif self.class.builtins.include? m[0]
            token Name::Builtin
          else
            token Name::Other
            push :composite
          end
        end
      end

      state :composite do
        mixin :comments_and_whitespace

        rule /[{]/ do
          token Punctuation
          pop!
          push :hash
        end

        rule /[\[]/ do
          token Punctuation
          pop!
          push :array
        end

        mixin :root

        rule //, Text, :pop!
      end

      state :hash do
        mixin :comments_and_whitespace

        rule /\=/, Punctuation
        rule /\}/, Punctuation, :pop!

        mixin :root
      end

      state :array do
        mixin :comments_and_whitespace

        rule /,/, Punctuation
        rule /\]/, Punctuation, :pop!

        mixin :root
      end

      state :dq do
        rule /[^\\"]+/, Str::Double
        rule /\\"/, Str::Escape
        rule /"/, Str::Double, :pop!
      end

      state :sq do
        rule /[^\\']+/, Str::Single
        rule /\\'/, Str::Escape
        rule /'/, Str::Single, :pop!
      end

      state :heredoc do
        rule /\n/, Str::Heredoc, :heredoc_nl
        rule /[^$\n]+/, Str::Heredoc
        rule /[$]/, Str::Heredoc
      end

      state :heredoc_nl do
        rule /\s*(\w+)\s*\n/ do |m|
          if m[1] == @heredocstr
            token Name::Constant
            pop! 2
          else
            token Str::Heredoc
          end
        end

        rule(//) { pop! }
      end
    end
  end
end
104 changes: 104 additions & 0 deletions lib/rouge/lexers/terraform.rb
@@ -0,0 +1,104 @@
# -*- coding: utf-8 -*- #

module Rouge
  module Lexers
    load_lexer 'hcl.rb'

    class Terraform < Hcl
      title "Terraform"
      desc "Terraform HCL Interpolations"

      tag 'terraform'
      aliases 'tf'
      filenames '*.tf'

      def self.keywords
        @keywords ||= Set.new %w(
          terraform module provider variable resource data provisioner output
        )
      end

      def self.declarations
        @declarations ||= Set.new %w(
          var local
        )
      end

      def self.reserved
        @reserved ||= Set.new %w()
      end

      def self.constants
        @constants ||= Set.new %w(true false null)
      end

      def self.builtins
        @builtins ||= %w()
      end

      state :strings do
        rule /\\./, Str::Escape
        rule /\$\{/ do
          token Keyword
          push :interpolation
        end
      end

      state :dq do
        rule /[^\\"\$]+/, Str::Double
        mixin :strings
        rule /"/, Str::Double, :pop!
      end

      state :sq do
        rule /[^\\'\$]+/, Str::Single
        mixin :strings
        rule /'/, Str::Single, :pop!
      end

      state :heredoc do
        rule /\n/, Str::Heredoc, :heredoc_nl
        rule /[^$\n\$]+/, Str::Heredoc
        rule /[$]/, Str::Heredoc
        mixin :strings
      end

      state :interpolation do
        rule /\}/ do
          token Keyword
          pop!
        end

        mixin :expression
      end

      id = /[$a-z_\-][a-z0-9_\-]*/io

      state :expression do
        mixin :primitives
        rule /\s+/, Text

        rule %r(\+\+ | -- | ~ | && | \|\| | \\(?=\n) | << | >>>? | == | != )x, Operator
        rule %r([-<>+*%&|\^/!=?:]=?), Operator
        rule /[(\[,]/, Punctuation
        rule /[)\].]/, Punctuation

        rule id do |m|
          if self.class.keywords.include? m[0]
            token Keyword
          elsif self.class.declarations.include? m[0]
            token Keyword::Declaration
          elsif self.class.reserved.include? m[0]
            token Keyword::Reserved
          elsif self.class.constants.include? m[0]
            token Keyword::Constant
          elsif self.class.builtins.include? m[0]
            token Name::Builtin
          else
            token Name::Other
Collaborator

Why are declarations/keywords/etc. tokenized here and in the root state of the hcl lexer? Can you give me an example of each case?

Contributor Author

The expression state is used inside interpolations. For example:

resource "something" {
}

resource "other" {
  key = "${resource.something.key}" # This is handled by the expression state.
}

Collaborator

OK.
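
To make the exchange above concrete, here is a rough, self-contained sketch of a state-stack tokenizer in which `${` inside a double-quoted string pushes a separate expression state, and `}` pops back to the string. It is not Rouge's API: `tokenize`, `classify`, `KEYWORDS`, and the token symbols are hypothetical names, and the rules are far simpler than those in the PR.

```ruby
require 'strscan'

# Hypothetical keyword list for illustration; the PR's Terraform lexer has more.
KEYWORDS = %w(resource variable).freeze

# Hypothetical helper: keyword lookup shared by the root and expression states.
def classify(word)
  KEYWORDS.include?(word) ? :keyword : :name
end

# Tokenize with an explicit state stack: :root -> :dq -> :expression.
def tokenize(source)
  scanner = StringScanner.new(source)
  stack = [:root]
  tokens = []

  until scanner.eos?
    case stack.last
    when :root
      if scanner.scan(/"/)
        tokens << [:string, scanner.matched]
        stack.push(:dq)
      elsif scanner.scan(/[a-z_]+/i)
        tokens << [classify(scanner.matched), scanner.matched]
      else
        tokens << [:text, scanner.getch]
      end
    when :dq
      if scanner.scan(/\$\{/)
        # Entering an interpolation: identifiers are now lexed by :expression.
        tokens << [:interp, scanner.matched]
        stack.push(:expression)
      elsif scanner.scan(/"/)
        tokens << [:string, scanner.matched]
        stack.pop
      elsif scanner.scan(/[^"$]+/)
        tokens << [:string, scanner.matched]
      else
        tokens << [:string, scanner.getch] # a lone "$" that does not start "${"
      end
    when :expression
      if scanner.scan(/\}/)
        tokens << [:interp, scanner.matched]
        stack.pop
      elsif scanner.scan(/[a-z_]+/i)
        tokens << [classify(scanner.matched), scanner.matched]
      else
        tokens << [:punct, scanner.getch]
      end
    end
  end

  tokens
end

tokens = tokenize('key = "${resource.something.key}"')
p tokens.select { |type, _| type == :keyword } # => [[:keyword, "resource"]]
```

Without the expression state, everything between the quotes would come out as plain string tokens, which is why the keyword lookup appears in both places.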

          end
        end
      end
    end
  end
end
27 changes: 27 additions & 0 deletions spec/lexers/terraform_spec.rb
@@ -0,0 +1,27 @@
# -*- coding: utf-8 -*- #

describe Rouge::Lexers::Terraform do
  let(:subject) { Rouge::Lexers::Terraform.new }

  include Support::Lexing
  it 'parses a basic Terraform file' do
    tokens = subject.lex('terraform {}').to_a
    assert { tokens.size == 3 }
    assert { tokens.first[0] == Token['Keyword'] }
  end

  describe 'guessing' do
    include Support::Guessing

    it 'guesses by filename' do
      assert_guess :filename => 'foo.tf'
      deny_guess :filename => 'foo'
    end

    it 'guesses by mimetype' do
    end

    it 'guesses by source' do
    end
  end
end
1 change: 1 addition & 0 deletions spec/visual/samples/hcl
@@ -0,0 +1 @@
# See Terraform lexer