What is the meaning of hacked? #33

Closed
effortprogrammer opened this issue Mar 12, 2023 · 5 comments
Labels
question Further information is requested

Comments

@effortprogrammer

Hey, I was reading your Readme.md and saw that your repo was hacked. What does this mean, and are users like me affected by the hack? Or is this not something I should worry about?

@future10se

If you're referring to the sentence:

This was hacked in an evening - I have no idea if it works correctly.

"Hack" was often used in a positive sense by enthusiasts in the early days of computing, before it took on its additional negative connotation in modern times.

From wiktionary.org:

The computer senses date back to at least 1955 when it initially referred to creative problem solving.

4. (computing) To make a quick code change to patch a computer program, often one that, while being effective, is inelegant or makes the program harder to maintain.
    Synonyms: frob, tweak
    I hacked in a fix for this bug, but we'll still have to do a real fix later.

5. (computing) To accomplish a difficult programming task.
    He can hack like no one else and make the program work as expected.

7. (computing, slang, transitive) To work with something on an intimately technical level. 

8. (transitive, colloquial, by extension) To apply a trick, shortcut, skill, or novelty method to something to increase productivity, efficiency or ease. 

It can still mean the act of compromising a system or obtaining unauthorized access, of course, but as with any word, context matters. (You might have heard the phrase "life hack", for example.)

@effortprogrammer
Author

Ohh, my bad. My primary language is not English lol

@ggerganov
Owner

ggerganov commented Mar 12, 2023

Here is a short summary of the implementation (a.k.a. "hacking") process if anyone is interested - might be useful for porting other models:

  • Started out with the GPT-J example from the ggml repo
  • Used the 4-bit branch of ggml since it has initial quantization support that we want
  • The LLaMA model has a very similar architecture to GPT-J. It uses the same positional encoding (RoPE) and a similar activation function (SiLU instead of GELU). The main differences are:
    • no bias tensors
    • some new normalization layers
    • extra tensor in the feed-forward part
    • a slightly different order of the operations
    • the context size does not seem to be fixed? (if I understand the code correctly)
  • All these are trivial changes that can be applied to the GPT-J example just by looking at the original Python LLaMA code
  • Modified the Python conversion script to read the .pth file of the 7B model and dump it to ggml format as usual
  • The tokenizer was obviously more complex and problematic, but I made a quick hack to at least support it partially
  • This was enough to get LLaMA-7B running. Later, the rest of the models became supported by figuring out how to merge the original parts of the model, thanks to some references from the community
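
For anyone porting other models, the architectural differences listed above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not code from llama.cpp or ggml. It assumes that the "new normalization layers" are RMSNorm and that the "extra tensor" is a third feed-forward weight matrix used as a gate, as in the original Python LLaMA code. All function names are made up for this example, and the GPT-J side uses the exact GELU formula where real implementations often use a tanh approximation.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish): x * sigmoid(x). Fine for moderate inputs; exp() can overflow for very negative x."""
    return x / (1.0 + math.exp(-x))


def gelu(x: float) -> float:
    """Exact GELU, x * Phi(x), shown for comparison with SiLU."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: scale by the root mean square; no mean subtraction and no bias."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [w * v * scale for v, w in zip(x, weight)]


def gptj_style_ffn(x, w_in, b_in, w_out, b_out):
    """GPT-J-style feed-forward: two weight matrices, each with a bias, GELU in between."""
    hidden = [gelu(v + b) for v, b in zip(matvec(w_in, x), b_in)]
    return [v + b for v, b in zip(matvec(w_out, hidden), b_out)]


def llama_style_ffn(x, w1, w2, w3):
    """LLaMA-style feed-forward: three weight matrices, no biases, SiLU-gated."""
    gate = [silu(v) for v in matvec(w1, x)]   # gated branch
    up = matvec(w3, x)                        # the "extra tensor"
    return matvec(w2, [g * u for g, u in zip(gate, up)])
```

With this layout, the port from GPT-J mostly amounts to dropping the bias terms, swapping the activation, adding the second input projection, and replacing the normalization.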

Here is the LLaMA WIP branch in the ggml repo that I then migrated to become llama.cpp:
https://github.com/ggerganov/ggml/tree/llama

Through this process, there was no need to even run the original Python code. The downside is that I haven't had the chance to compare the outputs at different stages of the inference, so I have doubts about the correctness of this implementation. However, looking at the generated outputs, I guess it has to be correct.
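
For reference, comparing outputs at different stages could look roughly like the sketch below. It assumes both implementations can dump named intermediate tensors as flat lists of floats, in the same order. The helper names and the stage names in the usage example are hypothetical and are not part of llama.cpp or the original Python code.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two flat tensors."""
    if len(a) != len(b):
        raise ValueError(f"shape mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def first_divergence(reference, candidate, tol=1e-3):
    """Return (stage_name, diff) for the first stage whose outputs differ by
    more than tol, or None if every stage agrees within tol.

    Both arguments map stage names to flat lists of floats; stages are
    checked in the insertion order of the reference dict.
    """
    for name, ref_values in reference.items():
        diff = max_abs_diff(ref_values, candidate[name])
        if diff > tol:
            return name, diff
    return None
```

Usage, with made-up stage names and values:

```python
reference = {"embed": [0.1, 0.2], "attn": [1.0, 2.0], "ffn": [3.0]}
candidate = {"embed": [0.1, 0.2], "attn": [1.0, 2.5], "ffn": [9.0]}
print(first_divergence(reference, candidate))  # ('attn', 0.5)
```

Reporting only the first diverging stage is deliberate: once one stage is off, every later stage is expected to differ as well.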

