Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Index support for enum values #10

Closed
Gummibeer opened this issue Mar 13, 2019 · 16 comments
Closed

Index support for enum values #10

Gummibeer opened this issue Mar 13, 2019 · 16 comments
Assignees

Comments

@Gummibeer
Copy link
Collaborator

Gummibeer commented Mar 13, 2019

enumerable
adjective MATHEMATICS

able to be counted by one-to-one correspondence with the set of all positive integers.

This is the current definition of enumerable. The current Enum class does not implement this - it's only a string map. Instead it should be a map of strings to integers (positive).
So the current implementation should be changed at multiple points.

  • force the value to be integer
  • force the key to be string
  • force key and value to be unique

An idea would be to switch over to constants (key forced to be string and unique). While looping over all constants the value could be also checked for uniqueness.
The results of these reflection class operations could be cached in a static variable(s).

The definition of the class will also look more like an enum in this case.

/**
 * @method static self sunday()
 * @method static self monday()
 * @method static self tuesday()
 * @method static self wednesday()
 * @method static self thursday()
 * @method static self friday()
 * @method static self saturday()
 */
final class Days extends Enum
{
  const SUNDAY = 0;
  const MONDAY = 1;
  const TUESDAY = 2;
  const WEDNESDAY = 3;
  const THURSDAY = 4;
  const FRIDAY = 5;
  const SATURDAY = 6;
}

Constants like FOO_BAR could be converted to lowerCamelCase fooBar().

The usage could be Days::monday() to get a valued instance of the Enum. Days::from($day)->equals(Days::MONDAY) or without type safety Days::from($day) == Days::MONDAY.

Similar behavior is implemented in SplEnum http://php.net/manual/de/class.splenum.php without the other functionalities provided by this package.

@Gummibeer
Copy link
Collaborator Author

PS: from() isn't a nice wording for me - a common standard is make() or create().

@brendt
Copy link
Collaborator

brendt commented Mar 14, 2019

Here's the definition of enums in programming languages, according to Wikipedia:

In computer programming, an enumerated type (also called enumeration, enum, or factor in the R programming language, and a categorical variable in statistics) is a data type consisting of a set of named values called elements, members, enumeral, or enumerators of the type. The enumerator names are usually identifiers that behave as constants in the language.

I think this library does adhere to this definition: "a data type consisting of a set of named values".

Furthermore, it is already possible to map integers to names values by using value methods: /~https://github.com/spatie/enum#enum-specific-methods

While I get what you're saying, I think it misses the point of the package. The core idea behind it is to eliminate the inconsistency between the methods and constant values, as other enum packages do. I've explained this in detail here: https://stitcher.io/blog/php-enums

I agree that from might not be the best name. We might change that in a later version.

I'll be closing the issue, but feel free to continue the conversation.

@brendt brendt closed this as completed Mar 14, 2019
@Gummibeer
Copy link
Collaborator Author

Yes, the two definitions don't conflict - the programming one only misses the fact that the named values have a numerical counter part.

Let's stick to your "perfect world" example:

enum PostStatus {
    DRAFT, PUBLISHED, ARCHIVED;
}

If I convert it to this package it would be:

/**
 * @method static self DRAFT()
 * @method static self PUBLISHED()
 * @method static self ARCHIVED()
 */
class PostStatus extends Enum {}

If I now call PostStatus::DRAFT() it will return DRAFT instead of 0 which would be the return of all languages I know that support enums.

So yes, it's possible at the moment - but it isn't the default behavior. The default behavior is a limited string collection which are usable as string or object. So the default should return the index of the @method annotation. This could be extended to a user defined index like: @method static self DRAFT() 2.

Also MySQL as a database which support enum keeps them as a string to array map. https://dev.mysql.com/doc/refman/5.7/en/enum.html

I concede that the string return value works better with MySQL because there index starts with 1 and the recommendation is to use the string value instead of the index.

@brendt
Copy link
Collaborator

brendt commented Mar 14, 2019

We could add a numeric index. Though I'm unsure about enums being represented by their numeric value: it doesn't have any meaning. The status "draft" and its index "0" don't have meaning besides it's current position.

Say we were to add another enum value two years after a project has been in production. Let's call it PRE_DRAFT just for clarity. This values comes before DRAFT. Now we have two options: either add a database migration to shift all the current stored values by one, or add the PRE_DRAFT status at the end. This in turn might break your code, if you're comparing statuses based on their numeric index.

I know all of this is solvable, but at what cost? Do we really want this package to be this brittle? Do we want programmers to be concerned about breaking production every time they make such a trivial change?

The only argument I can think of in favour of storing numeric indices is performance. But if your project is large enough for these kinds of things to become performance issues, I recon you won't have any problems with changing the default behaviour of this package.

Please feel free to share your thoughts! I'm open to be proven wrong here.

@Gummibeer
Copy link
Collaborator Author

Like I said - if it comes to DB the current implementation works better

I concede that the string return value works better with MySQL because there index starts with 1 and the recommendation is to use the string value instead of the index.

I haven't checked what MySQL uses under the hood, behind the strings it also stores tiny-ints, but I also don't know what happens if I add a new value to the MySQL enum. This would be a pretty good question.^^
Stackexchange: https://dba.stackexchange.com/q/6547/68090
Following multiple posts there changing enums in MySQL is in all cases a pain. Benefit, real enum columns are a lot faster than simple varchar ones.

The benefit of numerics is: performance, size, don't care about casing and option to compare/order them.

My primary issue is more a conceptual and theoretical one - keep this Enum class as close as possible to a real enum. And a real enum would be something like:

enum PostStatus {
    DRAFT, PUBLISHED, ARCHIVED
}
PostStatus::DRAFT // 0
PostStatus::PUBLISHED // 1
PostStatus::ARCHIVED // 2

or with manual defined first index

enum PostStatus {
    DRAFT=2, PUBLISHED, ARCHIVED
}
PostStatus::DRAFT // 2
PostStatus::PUBLISHED // 3
PostStatus::ARCHIVED // 4

@Gummibeer
Copy link
Collaborator Author

After a night and the fact that an userland Implementation should make things easier at first doesn't have to be 100% definition compliant I have an idea.

The __toString() method still returns the string value draft. But the getNumericIndex() or just getIndex() will return the calculated index of the string value.

Atm I don't have an idea how to define a custom index via all options to define values. As doc-tag it's easy but via the $map variable it isn't. 🤔 The only way I see for this case is the already described one to override the method in an anonymous class.

@brendt
Copy link
Collaborator

brendt commented Mar 15, 2019

I really appreciate you thinking along, thanks!

Some more background information: the first implementation of this package supported the @method static self DRAFT() 2 syntax. We moved from it to the $map variable because it seemed like too much "custom" interpretation of doc blocks.

To be honest: I don't like $map, and personally I'd never use it: I'd always use the value methods.

I'm open for adding getIndex(), the index should be generated automatically based on the actual index of the method, but also be overridable. So there's two design decisions to be made:

  • How are we going to override the index? It could be done via $map, the docblock description or via value specific implementations with anonymous classes.
  • We should improve serialization and deserialisation. Ideally you'd want to be able to choose how to store enums in a database: with their index or with their string value. re-constructing an enum from a database value should also support both.

Are you motivated to try and PR a solution? Otherwise I'll take a look at it somewhere in the next weeks, and we can review it together.

@Gummibeer
Copy link
Collaborator Author

Gummibeer commented Mar 15, 2019

No problem! :)
I will check it out and try to do it. Should the $map get removed at all? I'm not a big fan of a class that is configurable via thousand ways. It introduces useless complexity and the $map doesn't come with any benefit vs annotations. Instead it misses important features.

Because of the fact that this would be BC at all we could also drop this variable.

Something I would add as a constraint: you can't mix defined index and auto index. Or at least not without conflicts.
To check this would introduce a lot of complexity. Something that could be possible would be to define the starting index. and the following would use this as base.

Deserialization would be easy. For serialization you can use the methods getKey()/getValue() or getIndex(). In addition it could be configured via an property - if you want to get the PostStatus as string everytime and the Day as int.

@brendt
Copy link
Collaborator

brendt commented Mar 15, 2019

I'm fine with dropping $map and target a 2.0 release :)

Something I would add as a constraint: you can't mix defined index and auto index.

That sounds good!

@Gummibeer
Copy link
Collaborator Author

What do you think about @method static self draft(int $index = 2) Official method description?

This would be more effort to write but allows to use the method description for what it's made for and follows official php-doc rules. The only con would be that it suggests the IDE and developer that there is an optional argument.

https://www.phpliveregex.com/p/rit

@brendt
Copy link
Collaborator

brendt commented Mar 18, 2019

I'm afraid it would be confusing to the developer though. As if you could pass another value to the draft method. I'm not saying it's a no go, but I'd like to see some other ways of overriding an index. Right now these are the possibilities we already touched on:

  • As the doc comment description
  • Via $map, but I think we both agree that we don't like this.
  • Via the doc comment method signature
  • Via value specific anonymous classes

Maybe we should take a step back and ask ourselves how often a user would want to override the index? Wouldn't value specific anonymous classes suffice in those cases? Say we'd add a getIndex method on the enum class with a default implementation, but one you could override if needed on a per-value basis.

As always: I'm thinking aloud so feel free to say I'm wrong 😁

@Gummibeer
Copy link
Collaborator Author

@brendt I 100% agree! During scribbling around I noticed that we have two ways, and will have to keep them, to declare values - doc-block & static method.
Atm we first load all static methods and after this load the doc-block. If we generate auto indices how should we work with this? Static methods are sorted from up to down, depending on your code-style this could be alphabetical or logical.
Doc-Block methods could also be in the middle. How should we handle this?

And if we ask the question you've already opened

Maybe we should take a step back and ask ourselves how often a user would want to override the index?
how often a user would want indices?

So far I'm the only one and so far I work with constants and a defined getIndex() method. Here we get to the $map again. Even if it's duplicated I think the only way to give the user full control of the index for a value is the $map and the getIndex() method.

@brendt
Copy link
Collaborator

brendt commented Mar 18, 2019

If we generate auto indices how should we work with this?

After an enum is "parsed", its contents are stored into Enum::$cache. It's unfortunate that we need a singleton on the Enum class, but there's no way around this, as far as I'm aware.

Enum::$cache contains an array of enum values per specific enum class. My guess is that we could use that array index. We can store it the same way we're storing $value when constructing an enum: /~https://github.com/spatie/enum/blob/master/src/Enum.php#L56

My first approach would be to add it in the constructor, we could use array_search to find the index based on the given value.

Overriding indices should be supported in Enum::resolve. This is of course the difficult part. At the moment we're caching enum values, but now we need to cache both the value and the index. Maybe we need a new object representing an enum value, instead of caching the values themselves:

$enumValues[$value] = $name;

Would become something like:

$enumValues[$enumValueDefinition->getIndex()] = $enumValueDefinition;

Both resolveValuesFromStaticMethods and resolveFromDocblocks should in their turn return an array of these EnumValueDefinition objects. Feel free to suggest another name!

@brendt
Copy link
Collaborator

brendt commented Mar 18, 2019

As a sidenote: we could consider storing the enum value definitions in an array, instead of another object. Ideally we'd want to use a struct here, but we're still working with PHP, of course 😁

I'd consider using an array for performance reasons, though maybe a class with public properties is actually more performant than arrays...

@brendt
Copy link
Collaborator

brendt commented Mar 18, 2019

Let's re-open this issue btw, but I'm going to change to title to better reflect what it is about.

@brendt brendt changed the title Enum doesn't implement enumerable but only string maps Index support for enum values Mar 18, 2019
@brendt brendt reopened this Mar 18, 2019
@Gummibeer
Copy link
Collaborator Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants