
openai/gpt-5.3-codex

73.2%

Pass Rate

41/56

Tasks Passed

Runs

69.6%

pass@1

73.2%

pass@3

92.9%

Consistency

0.1

Temperature

Thinking

434,527

Tokens

$3.68

Cost

1st: 85, 2nd: 32, Failed: 15, 41/56 passed

Known Shortcomings (14)

Sorted by occurrence count (most frequent first)

#	Concept	AL Concept	Count	Affected Tasks
1	query-object-syntax	query-definition	2	CG-AL-H011, CG-AL-H017
Description: The model failed to generate any valid AL code for the query object. The compilation errors at line 12 (identifier expected, '=' expected) and the note 'Generated code not found' suggest the model either produced no output or produced syntactically invalid AL for a Query object. The model likely does not know the correct syntax for defining an AL Query object with dataitems, columns, aggregation methods (Sum, Count), column filters, and ordering. AL Query objects have a specific structure with 'query', 'elements', 'dataitem', 'column', and 'filter' keywords that differs significantly from other AL object types. Correct Pattern: `query 70011 "CG Sales Summary" { QueryType = Normal; OrderBy = ascending(Sell_to_Customer_No); elements { dataitem(SalesLine; "Sales Line") { column(Document_No; "Document No.") { } column(Sell_to_Customer_No; "Sell-to Customer No.") { } column(Line_Amount_Sum; "Line Amount") { Method = Sum; } column(Line_Count; "Line Amount") { Method = Count; } filter(Document_Type; "Document Type") { ColumnFilter = const(Order); } } } }` Incorrect Pattern: `// Generated code not found (or syntactically invalid query definition at line 12)` Error Codes: AL0107, AL0353
2	empty-or-missing-code-generation	interface-definition	2	CG-AL-H021, CG-AL-M009
Description: The model failed to generate any valid AL code. The generated code appears to be empty or contains no recognizable AL object declarations. The task required creating an interface 'INotificationChannel', three implementing codeunits (70221, 70222, 70223), and a manager codeunit (70220) using List of [Interface] and Dictionary of [Text, Interface] collections. The compilation errors (AL0107 'identifier expected', AL0198 'Expected one of the application object keywords') at position 1:11 indicate the very first line of the output file is not a valid AL object declaration. The model either produced no code, produced a non-AL response, or failed to structure the output as proper AL objects. Correct Pattern: interface "INotificationChannel" { procedure Send(Message: Text): Boolean; procedure GetChannelName(): Text; } codeunit 70221 "CG Email Channel" { Access = Public; implements "INotificationChannel"; procedure Send(Message: Text): Boolean begin exit(true); end; procedure GetChannelName(): Text begin exit('Email'); end; } codeunit 70222 "CG SMS Channel" { Access = Public; implements "INotificationChannel"; procedure Send(Message: Text): Boolean begin exit(true); end; procedure GetChannelName(): Text begin exit('SMS'); end; } codeunit 70223 "CG Slack Channel" { Access = Public; implements "INotificationChannel"; procedure Send(Message: Text): Boolean begin exit(true); end; procedure GetChannelName(): Text begin exit('Slack'); end; } codeunit 70220 "CG Notification Manager" { Access = Public; var ChannelList: List of [Interface "INotificationChannel"]; ChannelDict: Dictionary of [Text, Interface "INotificationChannel"]; procedure RegisterChannel(Channel: Interface "INotificationChannel") begin ChannelList.Add(Channel); end; procedure BroadcastMessage(Message: Text): Integer var Channel: Interface "INotificationChannel"; SuccessCount: Integer; begin foreach Channel in ChannelList do if Channel.Send(Message) then SuccessCount += 1; exit(SuccessCount); end; procedure GetRegisteredChannelNames(): List of [Text] var Channel: Interface "INotificationChannel"; Names: List of [Text]; begin foreach Channel in ChannelList do Names.Add(Channel.GetChannelName()); exit(Names); end; procedure RegisterNamedChannel(Name: Text; Channel: Interface "INotificationChannel") begin ChannelDict.Set(Name, Channel); end; procedure SendToChannel(Name: Text; Message: Text): Boolean var Channel: Interface "INotificationChannel"; begin if ChannelDict.Get(Name, Channel) then exit(Channel.Send(Message)); exit(false); end; procedure GetChannelByName(Name: Text; var Channel: Interface "INotificationChannel"): Boolean begin exit(ChannelDict.Get(Name, Channel)); end; procedure ClearChannels() begin Clear(ChannelList); Clear(ChannelDict); end; } Incorrect Pattern: `// Generated code not found` Error Codes: AL0107, AL0104
3	parse-failure	unknown	2	CG-AL-M001, CG-AL-M112
Description: Failed to parse LLM analysis response: Looking at this failure, I need to analyze the compilation errors carefully: 1. AL0118/AL0132 errors about `"No."`: The test references `Product."No."` and `Product.SetRange("No.", ...)`, but the Correct Pattern: Incorrect Pattern:
4	dictionary-keys-method-signature	collection-types	1	CG-AL-H020
Description: The model incorrectly called Dictionary.Keys() with an argument (e.g., Dict.Keys(KeyList)), but in AL the Keys() method takes no arguments and returns a List directly. The correct pattern is to assign the result: KeyList := Dict.Keys(). This error occurs in multiple places: MergeDictionaries (iterating Dict1 and Dict2 keys) and GetKeys (returning dictionary keys). Correct Pattern: `KeyList := Dict.Keys()` Incorrect Pattern: `Dict.Keys(KeyList)` Error Codes: AL0126
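The Keys() pattern described for shortcoming 4 can be sketched as below. This is a minimal illustration, not the benchmark's reference solution: the object ID 50120, the codeunit name, the Dictionary of [Text, Integer] element types, and the rule that Dict2 overwrites Dict1 on duplicate keys are all assumptions made for the example.

```al
// Hypothetical object ID and name; not taken from task CG-AL-H020.
codeunit 50120 "Demo Dictionary Keys"
{
    // Merges two dictionaries. Assumption: values from Dict2 win on duplicate keys.
    procedure MergeDictionaries(Dict1: Dictionary of [Text, Integer]; Dict2: Dictionary of [Text, Integer]) Merged: Dictionary of [Text, Integer]
    var
        KeyList: List of [Text];
        DictKey: Text;
    begin
        // Keys() takes no arguments and returns the key list, so assign the result.
        KeyList := Dict1.Keys();
        foreach DictKey in KeyList do
            Merged.Set(DictKey, Dict1.Get(DictKey));

        KeyList := Dict2.Keys();
        foreach DictKey in KeyList do
            Merged.Set(DictKey, Dict2.Get(DictKey));
    end;

    // Returns the keys of a dictionary as a list.
    procedure GetKeys(Dict: Dictionary of [Text, Integer]): List of [Text]
    begin
        exit(Dict.Keys());
    end;
}
```

Set is used instead of Add in the merge so that a key present in both inputs does not raise a runtime error.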
5	al-syntax-basics	codeunit-definition	1	CG-AL-M005
Description: The model generated AL code with a syntax error at line 255, column 60. The generated code was not captured/found in the output ('Generated code not found'), but the compilation errors (AL0104: Syntax error, ')' expected and '}' expected) indicate the model produced malformed AL code - likely incorrect procedure signatures, malformed expressions, or improperly structured code blocks. The task and test definitions are valid; the model simply failed to produce syntactically correct AL code for the External Payment Service codeunit. Correct Pattern: `A properly structured codeunit 70002 'External Payment Service' with correct AL syntax for all procedure signatures including SendPaymentRequest, ValidatePaymentResponse, GetPaymentStatus, HandlePaymentWebhook, and LogPaymentTransaction, using proper HttpClient patterns and JSON handling.` Incorrect Pattern: `// Generated code not found - but line 255 has syntax error at column 60` Error Codes: AL0104
6	json-typed-getter-methods	json-api-usage	1	CG-AL-H014
Description: The model failed to generate valid AL code for JSON parsing. The compilation error AL0133 at line 47:33 indicates the model tried to pass a Text value where a Boolean was expected, likely misusing JsonObject methods. In AL, JsonObject doesn't have typed getter methods like GetText(), GetInteger(), GetBoolean() directly. Instead, you must use Get() to retrieve a JsonToken, then convert it to a JsonValue, and call AsText(), AsInteger(), AsBoolean(), AsDecimal() etc. The model either didn't generate code at all (the generated code section says 'not found' but the error suggests code was generated) or generated code using incorrect JSON API patterns. The task description itself uses fictional method names like 'GetText()', 'GetInteger()', 'GetBoolean()', 'GetArray()' which don't exist on JsonObject in AL - the correct approach is JsonObject.Get(Key, JsonToken) then JsonToken.AsValue().AsText() etc. However, this is still a model knowledge gap since a knowledgeable model should know the correct AL JSON API and translate the task intent accordingly. Correct Pattern: `var JToken: JsonToken; JValue: JsonValue; begin if CustomerJson.Get('active', JToken) then begin JValue := JToken.AsValue(); active := JValue.AsBoolean(); end;` Incorrect Pattern: `// Line 47 likely contained something like: active := CustomerJson.GetBoolean('active'); or similar incorrect JSON API usage with wrong parameter types` Error Codes: AL0133
7	secrettext-isolated-storage-api	secrettext-handling	1	CG-AL-H016
Description: The model generated code that attempts to pass SecretText values to IsolatedStorage.Set and IsolatedStorage.Get methods incorrectly. The AL0133 errors ('cannot convert from SecretText to Joker') at lines 23 and 39 indicate the model did not correctly use the IsolatedStorage overloads that accept SecretText. In AL, IsolatedStorage has specific overloads for SecretText: IsolatedStorage.Set(Text, SecretText [, DataScope]) for storing and IsolatedStorage.Get(Text, DataScope, var SecretText) for retrieving. The model likely tried to use the wrong overload or passed arguments in the wrong order/type, causing the type conversion errors. The generated code was not captured but the compilation errors clearly show the model failed to correctly call IsolatedStorage APIs with SecretText parameters. Correct Pattern: procedure StoreApiKey(ApiKey: SecretText) begin IsolatedStorage.Set('CG_API_KEY', ApiKey, DataScope::Module); end; procedure RetrieveApiKey(): SecretText var StoredKey: SecretText; begin IsolatedStorage.Get('CG_API_KEY', DataScope::Module, StoredKey); exit(StoredKey); end; procedure BuildAuthHeader(ApiKey: SecretText): Text begin exit('Bearer ' + ApiKey.Unwrap()); end; procedure MaskSecret(Secret: SecretText): Text var Unwrapped: Text; begin Unwrapped := Secret.Unwrap(); if StrLen(Unwrapped) <= 4 then exit(Unwrapped + '**'); exit(CopyStr(Unwrapped, 1, 4) + ''); end; Incorrect Pattern: `IsolatedStorage.Set('CG_API_KEY', ApiKey, DataScope::Module); // line 23 area - incorrect SecretText usage IsolatedStorage.Get('CG_API_KEY', DataScope::Module, StoredValue); // line 39 area - incorrect SecretText retrieval` Error Codes: AL0133
8	table-trigger-names	table-triggers	1	CG-AL-M010
Description: The model used invalid trigger names 'OnAfterInsert', 'OnAfterModify', and 'OnAfterDelete' on a table object. In AL, the valid table triggers are 'OnInsert', 'OnModify', 'OnDelete', and 'OnRename'. The 'OnAfter' variants are integration event names (e.g., OnAfterInsertEvent) that are published by the system but are not valid as trigger declarations within a table definition. The model confused integration/subscription event names with actual table trigger names. Correct Pattern: `Use the correct table trigger names: 'trigger OnInsert()', 'trigger OnModify()', 'trigger OnDelete()'. For cascade delete, implement 'trigger OnDelete()' in the Project table to delete related Project Task records.` Incorrect Pattern: `OnAfterInsert OnAfterModify OnAfterDelete` Error Codes: AL0162
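The trigger names and cascade delete described for shortcoming 8 could look roughly like the sketch below. The object IDs (50100, 50101), the "Demo" table names, and the field layout are hypothetical and are not taken from task CG-AL-M010; only the trigger names OnInsert and OnDelete come from the description above.

```al
// Hypothetical parent table; IDs, names, and fields are illustrative only.
table 50100 "Demo Project"
{
    DataClassification = CustomerContent;

    fields
    {
        field(1; "No."; Code[20]) { }
        field(2; Description; Text[100]) { }
    }

    keys
    {
        key(PK; "No.") { Clustered = true; }
    }

    // Valid table trigger name (not OnAfterInsert).
    trigger OnInsert()
    begin
        TestField("No.");
    end;

    // Valid table trigger name (not OnAfterDelete).
    // Cascade delete: remove the tasks that belong to this project.
    trigger OnDelete()
    var
        ProjectTask: Record "Demo Project Task";
    begin
        ProjectTask.SetRange("Project No.", "No.");
        ProjectTask.DeleteAll(true);
    end;
}

// Hypothetical child table referenced by the cascade delete above.
table 50101 "Demo Project Task"
{
    DataClassification = CustomerContent;

    fields
    {
        field(1; "Project No."; Code[20]) { TableRelation = "Demo Project"; }
        field(2; "Task No."; Integer) { }
    }

    keys
    {
        key(PK; "Project No.", "Task No.") { Clustered = true; }
    }
}
```

DeleteAll(true) is chosen here so that any OnDelete trigger on the child table also runs; passing false would skip it.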
9	table-extension-structure	table-extension-definition	1	CG-AL-M006
Description: The model failed to generate valid AL code for the table extension. The compilation errors (AL0104 'Syntax error, } expected' and AL0198 'Expected one of the application object keywords') at line 64 indicate the generated code has fundamental structural/syntax issues - likely malformed table extension definition, misplaced procedures, or incorrect block nesting. The generated code was either empty, truncated, or structurally invalid. The model needed to produce a complete tableextension object with fields, procedures (UpdateRiskLevel, CalculatePaymentHistoryRating, GetCreditLimit, ValidateNewOrder, TriggerRiskAssessment), and proper validation triggers, but failed to generate syntactically valid AL code. Correct Pattern: tableextension 70001 "Advanced Customer Extension" extends Customer { fields { field(50100; "Credit Score"; Integer) { DataClassification = CustomerContent; trigger OnValidate() begin if ("Credit Score" < 300) or ("Credit Score" > 850) then Error('Credit Score must be between 300 and 850'); end; } field(50101; "Risk Level"; Option) { OptionMembers = Low,Medium,High,Critical; DataClassification = CustomerContent; } field(50102; "Last Risk Assessment Date"; Date) { DataClassification = CustomerContent; } field(50103; "Payment History Rating"; Decimal) { DataClassification = CustomerContent; } field(50104; "Preferred Payment Method"; Code[10]) { TableRelation = "Payment Method"; DataClassification = CustomerContent; } } procedure UpdateRiskLevel() begin case true of "Credit Score" >= 670: "Risk Level" := "Risk Level"::Low; "Credit Score" >= 580: "Risk Level" := "Risk Level"::Medium; "Credit Score" >= 500: "Risk Level" := "Risk Level"::High; else "Risk Level" := "Risk Level"::Critical; end; end; procedure CalculatePaymentHistoryRating(): Decimal begin exit(50.0); end; procedure GetCreditLimit(): Decimal begin case "Risk Level" of "Risk Level"::Low: exit(100000); "Risk Level"::Medium: exit(50000); "Risk Level"::High: exit(10000); "Risk Level"::Critical: exit(1000); end; end; procedure ValidateNewOrder(OrderAmount: Decimal): Boolean begin exit(OrderAmount <= GetCreditLimit()); end; procedure TriggerRiskAssessment() begin UpdateRiskLevel(); "Last Risk Assessment Date" := Today; Modify(); end; } Incorrect Pattern: `// Generated code not found - the code at line 64 had a syntax error suggesting malformed or truncated table extension structure` Error Codes: AL0104
10	table-field-property-context	table-definition	1	CG-AL-H005
Description: The model generated AL code for the two tables but incorrectly used the 'Caption' property in a context where it is not allowed (error AL0124). The compilation error at line 121 indicates the model placed a Caption property somewhere invalid—likely on a field or object in a way that AL does not permit. The model failed to produce syntactically correct table definitions. Additionally, the generated code was not captured ('Generated code not found'), suggesting the model may have produced malformed output that couldn't be properly extracted, but the compilation was still attempted and failed with the Caption property error. This is a model knowledge gap about proper AL table/field property usage. Correct Pattern: `Caption properties should only be used on table fields, pages, page fields, and other appropriate AL objects/elements. In table field definitions, Caption goes inside the field block. It should not be placed on properties like AutoIncrement or at the table level in certain contexts where it's not supported.` Incorrect Pattern: `Caption = '...'; // placed in an invalid context (line 121 of generated file)` Error Codes: AL0124
11	complex-report-with-helper-codeunit	report-definition-and-codeunit-generation	1	CG-AL-M007
Description: The model failed to generate any valid AL code at all. The task required creating both a Report 70001 'Sales Performance Analysis' and a helper codeunit 'CG-AL-M007 Mock Calculator' that the test file references. The generated code appears to be empty or syntactically invalid (the compilation error AL0104 'Syntax error, } expected' at line 11 and 'App generation failed' indicate the model produced malformed or incomplete AL code). The model needed to understand that the test file depends on two objects: (1) a Report object with Customer, Sales Header, and Sales Line data items, and (2) a Codeunit named 'CG-AL-M007 Mock Calculator' with methods like Initialize(), AddSalesLine(), GetRunningTotalByCustomer(), GetRunningTotalByRegion(), CalculateAverageOrderValue(), GetCustomerRank(), GetTopProduct(), GetProductSalesQuantity(), CalculateYoYComparison(), CalculateOrderFrequency(), GetTotalSales(), and GetCustomerCount(). The model either produced no code or fundamentally broken code. Correct Pattern: The model should have generated two objects: 1) report 70001 "Sales Performance Analysis" with DataItems for Customer, Sales Header ("Sales Header"), and Sales Line ("Sales Line"), with proper triggers, request page with date filters, ApplicationArea, and UsageCategory. 
2) codeunit XXXXX "CG-AL-M007 Mock Calculator" with internal Dictionary-based storage implementing all required methods: - Initialize() - AddSalesLine(CustomerCode: Code[20]; Region: Code[20]; ItemCode: Code[20]; Quantity: Decimal; Amount: Decimal) - GetRunningTotalByCustomer(CustomerCode: Code[20]): Decimal - GetRunningTotalByRegion(Region: Code[20]): Decimal - CalculateAverageOrderValue(): Decimal - GetCustomerRank(CustomerCode: Code[20]): Integer - GetTopProduct(): Code[20] - GetProductSalesQuantity(ItemCode: Code[20]): Decimal - CalculateYoYComparison(CurrentYear: Decimal; PreviousYear: Decimal): Decimal - CalculateOrderFrequency(OrderCount: Integer; Days: Integer): Decimal - GetTotalSales(): Decimal - GetCustomerCount(): Integer Incorrect Pattern: `// Generated code not found (empty or fundamentally broken output)` Error Codes: AL0104
12	yaml-parsing-string-manipulation	text-parsing-and-json-handling	1	CG-AL-M021
Description: The model failed to generate valid AL code for the YAML Handler codeunit. The compilation error at line 41 (syntax error, ')' expected) indicates the model produced syntactically invalid AL code, likely misusing string manipulation functions or incorrectly structuring procedure parameters/expressions. The task requires implementing custom YAML parsing (splitting text by line feeds, extracting key-value pairs) and converting between YAML and JSON using AL's JsonObject type. The model apparently struggled with the AL syntax for text manipulation operations needed to parse YAML content, producing code with syntax errors. Since the generated code was not captured ('Generated code not found'), but the compilation errors clearly point to model-generated code issues (line 41 of the generated .al file), this is a model knowledge gap in writing correct AL syntax for string parsing and JsonObject manipulation. Correct Pattern: `Proper AL text parsing using TextBuilder, splitting by LF character (10), using JsonObject.Add/Replace for building JSON, and iterating JsonObject keys with proper AL syntax. For example: Line.Split(': ') to split YAML key-value pairs, JsonObj.Add(Key, Value) to build objects, and proper procedure signatures with Text/Boolean parameter types.` Incorrect Pattern: `// Line 41 contained syntax error - exact code not captured but involved incorrect expression syntax` Error Codes: AL0104
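The line-splitting and key-value extraction described for shortcoming 12 can be sketched as follows. This is a simplified illustration under stated assumptions: it handles only flat `key: value` lines (no nesting, lists, quoting, or comments), stores every value as text, and uses a hypothetical object ID, codeunit name, and procedure name that do not come from task CG-AL-M021.

```al
// Hypothetical object; handles only flat "key: value" lines.
codeunit 50140 "Demo Yaml Sketch"
{
    procedure FlatYamlToJson(YamlContent: Text) Result: JsonObject
    var
        Lines: List of [Text];
        Line: Text;
        KeyText: Text;
        ValueText: Text;
        LF: Char;
        SepPos: Integer;
    begin
        // Split the content on the line feed character (10).
        LF := 10;
        Lines := YamlContent.Split(LF);

        foreach Line in Lines do begin
            // Remove surrounding whitespace, including a trailing carriage return.
            Line := Line.Trim();

            // The first colon separates key and value; skip lines without a key.
            SepPos := Line.IndexOf(':');
            if SepPos > 1 then begin
                KeyText := Line.Substring(1, SepPos - 1).Trim();
                if SepPos < StrLen(Line) then
                    ValueText := Line.Substring(SepPos + 1).Trim()
                else
                    ValueText := '';

                // Assumption: the first occurrence of a key wins.
                if not Result.Contains(KeyText) then
                    Result.Add(KeyText, ValueText);
            end;
        end;
    end;
}
```

Splitting on the first colon rather than on ': ' keeps values that themselves contain a colon (for example a URL) intact, at the cost of not validating the space that YAML expects after the separator.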
13	al-syntax-complex-codeunit	codeunit-procedure-implementation	1	CG-AL-M008
Description: The model generated AL code with syntax errors at line 122, column 49. The generated code was not captured/saved properly ('Generated code not found'), but the compilation errors (AL0104 syntax errors expecting ')', 'end', ';', and '}') indicate the model produced malformed AL code for the Purchase Approval Workflow codeunit. This is a complex codeunit requiring multiple procedures with various parameter types and return values, dictionary/list usage for in-memory tracking, and event subscribers. The model likely made a syntax error in a procedure body - possibly incorrect use of a complex expression, mismatched parentheses, or incorrect variable declaration syntax around line 122 of its output. Correct Pattern: `A properly structured codeunit 70003 'Purchase Approval Workflow' with correctly formed procedure signatures and bodies, using Dictionary or temporary tables for tracking approval state, with proper AL syntax for all expressions and statements.` Incorrect Pattern: `// Generated code not found - but line 122 col 49 had syntax error expecting ')' then 'end' then ';' then '}'` Error Codes: AL0104
14	no-series-auto-number-generation	table-definition-with-triggers	1	CG-AL-M003
Description: The model failed to generate any valid AL code for the Sales Contract table. The compilation errors (AL0185: Page '0' is missing) suggest the model produced a malformed or nearly empty table object that references non-existent pages (likely from incorrect LookupPageId/DrillDownPageId properties set to 0). The 'Generated code not found' note and the trivial compilation errors indicate the model either failed to produce output or produced a stub/skeleton that doesn't properly define the table with all required fields, triggers, and validation logic. The task requires knowledge of: AL table definition syntax, NoSeriesManagement for auto-generating Contract No., OnInsert/OnDelete trigger patterns, field validation triggers with Error() calls, Option fields, TableRelation properties, and proper page ID references (or omitting them when no page exists). Correct Pattern: table 70002 "Sales Contract" { Caption = 'Sales Contract'; DataClassification = CustomerContent; fields { field(1; "Contract No."; Code[20]) { Caption = 'Contract No.'; DataClassification = CustomerContent; } field(2; "Customer No."; Code[20]) { Caption = 'Customer No.'; TableRelation = Customer; DataClassification = CustomerContent; } field(3; "Start Date"; Date) { Caption = 'Start Date'; DataClassification = CustomerContent; } field(4; "End Date"; Date) { Caption = 'End Date'; DataClassification = CustomerContent; trigger OnValidate() begin if "End Date" <= "Start Date" then Error('End Date must be after Start Date'); end; } field(5; "Contract Value"; Decimal) { Caption = 'Contract Value'; DataClassification = CustomerContent; trigger OnValidate() begin if "Contract Value" <= 0 then Error('Contract Value must be positive'); end; } field(6; Status; Option) { Caption = 'Status'; OptionMembers = Draft,Active,Suspended,Terminated,Closed; DataClassification = CustomerContent; } field(7; "Payment Terms"; Code[10]) { Caption = 'Payment Terms'; TableRelation = "Payment Terms"; DataClassification = CustomerContent; } } keys { key(PK; "Contract No.") { Clustered = true; } } trigger OnInsert() begin if "Contract No." = '' then "Contract No." := GenerateContractNo(); Status := Status::Draft; end; trigger OnDelete() begin if Status = Status::Active then Error('Cannot delete active contract'); end; local procedure GenerateContractNo(): Code[20] begin // Auto-generate logic end; } Incorrect Pattern: `// Generated code not found - model produced invalid/empty output with Page '0' references` Error Codes: AL0185