experiments.txt
Exp1.1.1: {Input: pert-hw-img} → VLM → Error: Yes/No (Comparison against GT)
Exp1.1.2: {Input: pert-hw-img} → VLM → CoT Reasoning behind why something is erroneous + Error: Yes/No
Exp1.1.3: {Input: pert-hw-img} → VLM → Generate OCR internally + CoT Reasoning behind why something is erroneous based on the OCR + Error: Yes/No
Exp1.2.1: {Input: pert-hw-img} → VLM → Generate OCR text (LaTeX) → Same VLM / LLM (as baseline) → Error: Yes/No [Still evaluating VLMs]
Exp1.2.2: {Input: pert-hw-img} → VLM → Generate OCR text (LaTeX) → Same VLM / LLM (as baseline) → CoT Reasoning behind why something is erroneous + Error: Yes/No
Exp2.1: {Input: pert-hw-img} → VLM → {Output: CoT + Pinpoint Operator/Notational/Expression/Multistep Error in LaTeX}, {Input: Pinpoint Error in LaTeX, Perturbation Reason, Perturbed LaTeX} → Best LLM Evaluator → Match: Correct/Incorrect
Exp2.2: {Input: pert-hw-img} → VLM → Generate OCR → VLM → CoT + {Output: Pinpoint Operator/Notational/Expression/Multistep Error in LaTeX}, {Input: Pinpoint Error in LaTeX, Perturbation Reason, Perturbed LaTeX} → Best LLM Evaluator → Match: Correct/Incorrect
Exp3.1: {Input: pert-hw-img} → VLM → CoT + {Output: Corrected LaTeX}, {Input: Corrected LaTeX, Original LaTeX} → Best LLM Evaluator → Match: Correct/Incorrect
Exp3.2: {Input: pert-hw-img} → VLM → Generate OCR → VLM → CoT + {Output: Corrected LaTeX}, {Input: Corrected LaTeX, Original LaTeX} → Best LLM Evaluator → Match: Correct/Incorrect
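
A minimal sketch of how the Yes/No outputs of Exp1.x could be compared against GT. The function names, the DetectionScores container, and the assumption that each reply contains a line of the form "Error: Yes" or "Error: No" are illustrative and not part of the plan above.

```python
import re
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DetectionScores:
    accuracy: float
    precision: float
    recall: float


def parse_yes_no(reply: str) -> bool:
    """Return True if the last 'Error: Yes/No' marker in the reply says Yes.

    Assumes (hypothetically) that the model was prompted to emit that marker.
    """
    matches = re.findall(r"error\s*:\s*(yes|no)\b", reply, flags=re.IGNORECASE)
    if not matches:
        raise ValueError("reply contains no 'Error: Yes/No' verdict")
    return matches[-1].lower() == "yes"


def score_detection(predicted: Sequence[bool],
                    ground_truth: Sequence[bool]) -> DetectionScores:
    """Compare predicted error flags with GT flags, treating 'error' as positive."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("need two non-empty sequences of equal length")
    pairs = list(zip(predicted, ground_truth))
    tp = sum(p and g for p, g in pairs)          # error flagged, error present
    fp = sum(p and not g for p, g in pairs)      # error flagged, none present
    fn = sum(not p and g for p, g in pairs)      # error missed
    correct = sum(p == g for p, g in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return DetectionScores(correct / len(pairs), precision, recall)
```

Taking the last marker in the reply keeps the parser usable for the CoT variants, where reasoning text precedes the final verdict.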
_____________________________________
Steps
-------------------------------------
1. Prompt VLM for error detection. Input: Image, Output: Boolean
2. Prompt VLM for error detection with CoT. Input: Image, Output: Boolean and Reasoning
3. Prompt VLM for error detection with CoT and OCR. Input: Image, Output: Boolean, Reasoning and OCR
4. Prompt VLM for OCR. Input: Image, Output: OCR
5. Prompt VLM for error detection based on OCR. Input: OCR, Output: Boolean
6. Prompt VLM for error detection with CoT based on OCR. Input: OCR, Output: Boolean, Reasoning
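
The six steps above can be sketched as three small functions. This is only an illustration under stated assumptions: the Model callable, the prompt wording, and the "Error: Yes/No" reply convention are hypothetical placeholders, and any real VLM/LLM client would have to be wrapped to fit this interface.

```python
import re
from typing import Callable, Dict, Optional, Union

# Hypothetical model interface: (prompt, image bytes or None) -> raw text reply.
Model = Callable[[str, Optional[bytes]], str]
Result = Dict[str, Union[bool, str]]

# Assumed reply convention; the plan itself does not fix a format.
VERDICT = "End with a final line 'Error: Yes' or 'Error: No'."


def parse_verdict(reply: str) -> bool:
    """Return True if the last 'Error: Yes/No' marker in the reply says Yes."""
    found = re.findall(r"error\s*:\s*(yes|no)\b", reply, flags=re.IGNORECASE)
    if not found:
        raise ValueError("no 'Error: Yes/No' verdict in reply")
    return found[-1].lower() == "yes"


def detect_from_image(model: Model, image: bytes,
                      cot: bool = False, ocr: bool = False) -> Result:
    """Steps 1-3: one call on the image, optionally with CoT and internal OCR."""
    parts = ["Check the handwritten math in the image for errors."]
    if ocr:
        parts.append("First transcribe it to LaTeX.")
    if cot:
        parts.append("Explain your reasoning step by step.")
    parts.append(VERDICT)
    reply = model(" ".join(parts), image)
    return {"error": parse_verdict(reply), "raw": reply}


def transcribe(model: Model, image: bytes) -> str:
    """Step 4: image in, OCR text (LaTeX) out."""
    prompt = "Transcribe the handwritten math in the image to LaTeX only."
    return model(prompt, image).strip()


def detect_from_ocr(model: Model, latex: str, cot: bool = False) -> Result:
    """Steps 5-6: text-only call on the OCR output, optionally with CoT."""
    parts = ["Check this LaTeX for errors:\n" + latex]
    if cot:
        parts.append("Explain your reasoning step by step.")
    parts.append(VERDICT)
    reply = model(" ".join(parts), None)  # no image is passed in these steps
    return {"error": parse_verdict(reply), "raw": reply}
```

Steps 4 and 5 (or 4 and 6) chain naturally: the string returned by transcribe is passed as the latex argument of detect_from_ocr, using either the same model or a separate text-only baseline.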