-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introducing Cukes #431
Introducing Cukes #431
Conversation
I like it! You'll have to see how it plays out to stick to this kind of high-level language in all of your scenarios. Some people might want to see more detail, yet not want to read the Ruby code underneath in the step definition. You can either address this with hand-crafted user docs (which could go in the Gherkin), or you could consider having one scenario per feature that uses more detail and has concrete examples of the JSON request / responses. I'm interested to see how this develops. |
I think as it stands, your requirements are manufactured to want Cucumber, so let's look at them:
The second step is the one you'll have to justify, and I'm not convinced.
Writing these tests in pseudocode won't make people read them. (Unit) tests themselves do not give big picture views of your code, and should not serve as documentation, or example code. If you have a comprehensive test suite, would you expect people to browse through them one-by-one to see if their use case works?
There is an xkcd cartoon here, but instead of using a language some people know, you're going to use a language no-one knows. Although some tests become 'readable', the maintenance of tests becomes labourious. Frequently for each line of english you have to write a custom regular expression to parse it. Your test runner won't be language agnostic either.
Except the implementation will be, and you will need to rewrite the ad-hoc regex rules in each language if you want to avoid this. This is a huge burden for tests and maintenance.
The use case of cucumber (afaik) was so that clients signing off on a project could agree to pages and pages of legalese that was executable, to enable the customer to both understand and specifiy what they want. This is not your use case. Your use case as presented is "We need better documentation and examples to showcase features and make it easier to get started with the API". Cucumber solves this if you think reading tests is a good substitute for documentation and worked, comprehensive examples in other languages :-) My experience with cucumber is that it made writing tests slow, and running tests slow, and debugging tests harder. |
Sorry if I've misunderstood the problem, but have you considered using a tool like api-blueprint to document the API, then use tools that test the API against the defined documentation? It adds some semantics on top of Markdown for documenting requests/responses/schemas etc. and it's pretty readable, even when browsing via Github. Admittedly it probably won't capture intricate scenarios as easily as a cuke scenario/bash script, but thought it might be worth bringing up. |
(nb I'm not against having documentation and examples which can be tested to check they work, but I don't think that is the same as having tests that serve as documentation and examples. c.f python's doctests and the problems with using them) |
@BRMatt , I'm not opposed, but one of the tricky bits here is that as Balanced moves even further towards hypermedia, we want to move away from documenting exact URLs and HTTP methods, and focus more on workflows. I don't have the most experience with api-blueprint, but generally, these kinds of tools tend to focus on 'RESTish' style documentation rather than this kind, so that's something that's in here too. |
@tef, I think your 'use case for cukes' is the same thing, just more cynical. ;) I actually didn't even come up with cukes until the last minute, so maybe they are described as a fit for cukes, but it happened in the other direction. I don't think new people will go read a bunch of cukes, but it would be nice to show as a response to this kind of question. |
@tef makes an important point that the demands of testing and documentation can come into conflict. The RSpec project have tried this, and found that the comprehensive test suite made for pretty boring user documentation. Users don't want to see every single edge case. My advice in this situation is usually to push the edge cases down into unit tests. If you think there's still benefit in having that complete specification readable by everyone, consider writing two sets of cukes - one that provides a user manual, and another that provides your complete specification. |
@tef regarding the issue of speed with the tests, I think that is something that will change soon but @mattwynne will be able to confirm this. I also believe that the cukes would help to have a quick understanding of how things are supposed to work, but in order to fully understand you would have to go further deep into the code. |
The python equivalent to cucumber is called lettuce. It a little immature compared to cucumber. Its usefulness may be found in the use cases of interoperability with any current testing tools and libraries you might have or want to use in the future written in python. Mainly wanted to let you know of its existence. |
@steveklabnik Fair enough, those are reasonable points. I believe the maintainers of api-blueprint are trying to work in support for hypermedia relations, but I'm not sure how that fits in with your plans. |
@steveklabnik do you see Balanced then implementing the underlying scenarios for the Gherkin in each client library to ensure they conform to a basic functionality (the spec)? As I'm sitting here figuring out how we can implement the rev1.1 clients I really like the idea of a basic set of functional tests across all client libraries so we don't end up with the issues we had last round such as under-utliized client libraries having issues opened because they are missing esoteric functionality and we don't realise until someone actually tries to use it. |
Here are my thoughts: I think Gherkin is the way to go here, definitely +1 on it. First and foremost, there are many Gherkin/Ragel parsers in many different languages. The official client libraries that are supported are: Ruby, Python, PHP, Java, C# and Node.js -- all of which have Gherkin/Ragel runners. So this is largely aiding #4 as well and helps with the training wheels. We attempted to something very similar to this in our documentation: /~https://github.com/balanced/balanced-docs/tree/master/clients -- maybe instead of having scenarios like this, we organize tests implemented in different languages and use those to feed into our API? Ben might have more thoughts on this topic. This is definitely a step in the right direction and however, I'd like to compare it to something like http://apiary.io or RAML. |
@mjallday maybe; I'm not sure how directly applicable they are, as the clients would need testing that they consume things correctly, and these are tests that stuff is produced correctly. That said, we should integration test both together, at some point. |
I just realized that I wrote a blog post a few years back that's directly relevant to this conversation http://blog.steveklabnik.com/posts/2011-12-20-write-better-cukes-with-the-rel-attribute |
What'd be great is a way to visualise the gherkin documentation online, with each step able to sho the raw response / differing language implementations... |
@JonRowe , that's what http://relishapp.com is for :) |
It allows you to visualise the documentation, but it's not very non-developer friendly, I found @chrismdp's rack-usermanual idea very intriguing in this respect. It also doesn't currently offer a means to inspect what it actually does / provide sample responses. |
Oh that is super cool! |
@steveklabnik we'd be happy to set you up with a free https://cucumber.pro/ account once we go live. Cucumber Pro will replace Relish. Regarding the idea to document the API with Cucumber. If I was a developer using the API, I would like to see scenarios like this: Feature: Payment
Scenario: tokenize
Given some precondition here
When I http POST to /cards with JSON body:
"""
{
"number": "4111111111111111",
"expiration_month": "12",
"expiration_year": 2016
}
"""
Then the response code should be 201 with:
"""
{
... whatever the expected reply is...
}
""" This is more verbose than what you suggested, but I think this level of detail is needed for the audience you're targeting (developers using the API) |
@aslakhellesoy thanks! Yeah, that makes sense; with something like this, I'm never 100% sure what the right level of abstraction is |
To find the right level of abstraction, ask some stakeholders (in your case, users of the API):
Looking at your feature I think most developers unfamiliar with your API would answer like this:
Feature: feature name
Scenario: scenario name
Some descriptive text here is useful
Given ... |
@aslakhellesoy that looks really good! 👍 |
We've been using a similar setup at @nutshellcrm. I think it has been working pretty well, though I admit it's tough to find the right level of abstraction. |
@steveklabnik thanks. Just realised the Rack::Usermanual demo links were broken: should be fixed now. |
@andrewsardone would you be interested in some help and / or collaboration to turn this into a gem, maybe? |
@tef, does this style do anything to mitigate your concerns, or you still think it's a terrible idea? |
👍 |
@aslakhellesoy @mattwynne , one more question: We have scenarios that operate like this: /~https://github.com/balanced/balanced-api/blob/revision1/scenarios/cards/tokenize_without_secret.yml#L20-L23 This (kinda terrible because of the runner) test basically says "run the 'tokenize' scenario, and then make a request to the URI that's located at This kind of thing is going to come up a lot as we move more and more towards hypermedia; "execute this thing, if it passes, execute this other thing, based on what comes back." These feel like two different tests to me, but Thoughts? |
I've thought about this, too. Another possible testing challenge is that you're building a client that could possibly travel several paths to a successful outcome. So it's not just A likely situation for hypermedia, but not something I am aware that current testing tools can handle. |
@steveklabnik Could you also add a scenario that debits/charges the card that you just tokenized? One of the most painful parts of our current scenarios is referencing the result of a previous scenario. This becomes even worse if you want to use some sort of a fixture to avoid doing repetitive tasks like tokenizing a card or creating a customer for different scenarios. |
Should be linked to #481
I am currently re-running the tests over every commit to make sure that bisect works. Assuming that's successful, it's time to finally merge this sucker. |
@steveklabnik I just noticed that there are some scenarios that do not appear to be implemented, might we double check that we have everything /~https://github.com/balanced/balanced-api/blob/revision1/scenarios/debits/create_new_card.yml |
Accidentally had two successes, now we have one and a fail.
Okay! I have re-gone through and double checked. We were missing one or two. Let's get that CI building. |
🎊 |
oh snap! 💃 |
@steveklabnik have you looked at 'behave' instead of Cucumber? It's Python and allegedly better than Lettuce. http://pythonhosted.org/behave/index.html |
@julianpistorius I have not, no. |
Please note this is not merge-able yet, this is mostly for discussion purposes.
Hey all! One of the first things I've been tasked with is improving the API specification. Over the weekend I asked around my network, did some research, and spiked out an idea, and I wanted to run it all past you.
Requirements
As always, the first step is to gather requirements. Here's what I gather we need for the API spec:
All three of these are pretty straightforward, but might as well spell them out.
This one is complicated. There are two kinds of concerns here: {in,out}side of Balanced, and across software language groups.
The first is the in/out part. Basically, while we have a very clear idea of what everything does, people who are new to Balanced won't. Just today I got an email from someone, asking if Balanced was appropriate for them, when their marketplace was basically identical to Rent My Bike. I'm happy to answer those kinds of questions, but we need something that will be easy for people who don't live and breathe payments to grok.
Secondly, software language groups. If we write our specs in Python, Pythonistas will be able to understand it, but Haskellers might not. Well, they probably will, but you know what I mean. Everyone wants their preferred language, but our product is an API, and it's language agnostic. Therefore, we need something that's not incredibly closely tied to a particular software language.
This is related to #4, but is a bit different. It's not enough that the spec is legible to everyone, but that it's easy to bikeshed. ;)
What we have
Right now, the API spec looks largely like this: /~https://github.com/balanced/balanced-api/blob/revision1/scenarios/cards/tokenize_without_secret.yml
It's a bunch of YAML files, with a totally custom, bash/Python runner.
This situation is good in that it's language agnostic, but bad in that it's 100% custom, so it's not natively understood by anyone, and while it communicates to some degree of what the specs do, it's almost 100% code. It took me some time to be able to actually understand what's going on, which will always happen, of course, but it's still hard to turn into actual words, which impedes #4 and #5.
Solution
I've spiked out a temporary solution based on a tool called "Cucumber." Before I tell you more, let's compare:
Current: /~https://github.com/balanced/balanced-api/blob/revision1/scenarios/cards/tokenize_without_secret.yml
Cucumber: /~https://github.com/balanced/balanced-api/blob/cukes/features/cards/tokenize.feature
The 'current' has two features, and the cucumber only has one, so ignore past line 18 on 'current.'
Cucumber, if you haven't heard of it, is a tool for collaboration between stakeholders. The idea is that you write high level descriptions of what you want your software to do in psudo-English ("Gherkin"), and then that gets transformed into code to actually run and test those descriptions.
While I've implemented the scenarios in Ruby, you can write the 'code' aspect of the tests in a variety of languages. You can also write the features in a variety of languages, too. For example: /~https://github.com/cucumber/cucumber/blob/master/examples/i18n/el/features/addition.feature
As for how this gets turned into code, I have two versions:
JSON Schema: /~https://github.com/balanced/balanced-api/blob/cukes/features/step_definitions/card_steps.rb
Assertions: /~https://github.com/balanced/balanced-api/blob/b91d4264eaace33373606191ef4d32ca1372ac47/features/step_definitions/card_steps.rb
Since we already assert against JSON Schema, I think sticking with that makes sense, as it'll be easier to port.
I actually think the "200 code" assertion may be too low level.
Pros
Basically, this is the exact use case for Cucumber. Because the specifications are written in English, we help make it easier for people to understand, as well as making it easier from people to collaborate across language boundaries. I have some issue with English being the de facto human language that technologists communicate in, but it is what it is, at least for now. This solves #4 and #5, our biggest problems, nicely.
Furthermore, collaboration can be done on the high-level language, without worrying about the details. For example, Matin had some concern over my wording of the last part of that scenario, because 'that's not exactly what that does.' Excellent, we have some sort of communication error here. Cucumber's high-level descriptions bring that kind of thing out, and people can submit pull requests with just the Gherkin, and we can add the implementation ourselves.
I wonder if "elevator pitches for your code" is a good way of describing Cucumber. Maybe I've been in San Francisco too long. ;)
Cons
Cucumber is... not a tool without some controversy. It was originally described as an 'acceptance testing' tool, and so people got into 'test mode' and started writing tests. Cucumber is very bad when used in this way. Eventually, even some helpers were removed [1] to discourage writing tests at a low level. Basically, if you're not writing very high-level descriptions, you are Cuking It Wrong. ;)
Cucumber can be slow. There's the whole Gherkin-code 'compilation' step, so it can take longer than other frameworks. This is mitigatable, but we're already planning on making HTTP requests against our API for our tests, so they'll be slow no matter what the framework is.
As I mentioned above, some people really, really dislike cukes. I think that's mostly due to having terrible Cucumber suites, people have developed some scars. This can be overcome, but there may be some initial 'eww.' There may be with our specs as of right now. ;)
Other options
Basically, if this is a bad idea, we should just move to some sort of very basic unit testing library in a not-to-unusual language. I'd basically use Python and Requests to do this, since it's what we're most familiar with overall. I think that the downsides of being language specific is pretty big, though.
Conclusion
I'm sure I've forgotten a few things, but I'd love to hear feedback, good and bad. I think this is the right move, but I don't always have all the right answers... I would especially love feedback from @mattwynne and @aslakhellesoy .
1: http://aslakhellesoy.com/post/11055981222/the-training-wheels-came-off
Scenarios
CRUD tests
Here's all the cases we need to cover to merge. From https://docs.balancedpayments.com/1.1/api/ :
API KEYS
BANK ACCOUNT VERIFICATIONS
BANK ACCOUNTS
CALLBACKS
CARDS
CREDITS
CUSTOMERS
DEBITS
EVENTS
REFUNDS
REVERSALS
non-crud tests
That's exhaustive. Let's do this.