SWE-smith

SWE-smith agent trajectories: per-(model × task-instance) outcomes from SWE-agent runs over synthesized software-engineering task instances. Each cell is binary: resolved (1, the agent's patch passes the instance's tests) or not resolved (0). Outcomes are recorded under three trajectory-format splits (tool/xml/ticks).

13,472 items
3 subjects
35% observed
MIT license
software_engineering domain
text modality

Response matrix

Every model, scored item by item.

Each row is an AI model and each column an item, ordered so the strongest models and easiest items gather toward one corner. 3 subjects × 13,472 items, 35% of cells evaluated.
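
As a rough illustration of how such a matrix can be summarized, the sketch below computes per-item solve rates and a row/column ordering on a toy matrix. The model names, the values, and the use of `None` for unobserved cells are assumptions made for this example, not the viewer's actual data or code. For scale, 3 subjects × 13,472 items gives 40,416 cells, so 35% observed corresponds to about 14,000 evaluated cells.

```python
from typing import Optional

# Toy response matrix (hypothetical data): rows are models, columns are items.
# 1 = resolved, 0 = not resolved, None = unobserved (never evaluated).
matrix: dict[str, list[Optional[int]]] = {
    "model_a": [1, 1, 0, None],
    "model_b": [1, 0, None, 0],
    "model_c": [None, 1, 0, 0],
}


def mean_observed(cells):
    """Mean over observed cells only; None if nothing was observed."""
    seen = [c for c in cells if c is not None]
    return sum(seen) / len(seen) if seen else None


n_items = len(next(iter(matrix.values())))

# Solve rate of an item = fraction of observed subjects that resolved it.
item_solve_rate = [
    mean_observed([row[j] for row in matrix.values()]) for j in range(n_items)
]
model_accuracy = {name: mean_observed(row) for name, row in matrix.items()}

# Strongest models first, easiest items first, so the 1s gather in one corner.
row_order = sorted(matrix, key=lambda m: model_accuracy[m] or 0.0, reverse=True)
col_order = sorted(range(n_items), key=lambda j: item_solve_rate[j] or 0.0, reverse=True)

# Share of cells that were actually evaluated.
observed_fraction = sum(
    c is not None for row in matrix.values() for c in row
) / (len(matrix) * n_items)
```

Unobserved cells are skipped rather than counted as failures, which matters when only about a third of the matrix is filled in.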

[Figure: SWE-smith response matrix, AI models (rows) against items (columns). Legend: Correct (1), Incorrect (0), Unobserved.]

Sample items

What the questions look like — and how subjects answer.

A spread of items across the difficulty range, each shown with a few subjects' actual answers.

Item 1 · 0% solve rate

answer:

diff --git a/src/jinja2/exceptions.py b/src/jinja2/exceptions.py
index 082ebe8..ae88800 100644
--- a/src/jinja2/exceptions.py
+++ b/src/jinja2/exceptions.py
@@ -62,29 +62,20 @@ class TemplatesNotFound(TemplateNotFound):

     .. versionadded:: 2.2
     """

-    def __init__(
-        self,
-        names: t.Sequence[t.Union[str, "Undefined"]] = (),
-        message: t.Optional[str] = None,
-    ) -> None:
-        if message is None:
-            from .runtime import Undefined
-
-            parts = []
-
-            for name in names:
-                if isinstance(name, Undefined):
-                    parts.append(name._undefined_message)
-                else:
-                    parts.append(name)
-
-            parts_str = ", ".join(map(str, parts))
-            message = f"none of the templates given were found: {parts_str}"
-
-        super().__init__(names[-1] if names else None, message)
+    def __init__(self, names: t.Sequence[t.Union[str, 'Undefined']]=(), message:
+        t.Optional[str]=None) ->None:
+        if not names:
+            names = []
+
+        # Choose the first name for the parent class initialization
+        name = names[0] if names else None
+
+        # Initialize the parent class
+        TemplateNotFound.__init__(self, name, message)
+
+        # Override the templates attribute with all names
         self.templates = list(names)

-
 class TemplateSyntaxError(TemplateError):
     """Raised to tell the user that there is a problem with the template."""

Bug: Unexpected Behavior with TemplateNotFound Exception

Description

After a recent update, I'm encountering unexpected behavior when trying to handle templates that are not found. It seems like the TemplatesNotFound exception is not providing the expected template name in certain scenarios, which is causing issues in error handling logic.

Steps to Reproduce

  1. Attempt to load a template that does not exist using the Jinja2 environment.
  2. Catch the TemplatesNotFound exception.
  3. Observe the template name provided in the exception message.

Expected Outcome

The exception should provide the name of the first template that was not found, allowing for proper error handling and logging.

Actual Outcome

The exception message does not consistently provide the expected template name, leading to confusion in identifying which template was missing.

Environment

  • Jinja2 Version: Latest
  • Python Version: 3.10.15
  • Operating System: Linux

This issue is causing some of our error handling mechanisms to fail, as they rely on the template name being correctly identified in the exception. Any insights or fixes would be greatly appreciated!
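
To make the reported behavior concrete, here is a minimal sketch that mirrors the lines removed in the answer diff for this item, using simplified stand-in classes instead of the real Jinja2 ones. The stand-in `TemplateNotFound` and the omission of `Undefined` handling are assumptions made for illustration. Note that the removed lines pass the last name (`names[-1]`) to the parent class, whereas the issue text speaks of the first.

```python
from typing import Optional, Sequence


class TemplateNotFound(Exception):
    """Simplified stand-in for the single-template error (hypothetical)."""

    def __init__(self, name: Optional[str], message: Optional[str] = None) -> None:
        self.name = name
        self.message = message if message is not None else name
        self.templates = [name]
        super().__init__(self.message)


class TemplatesNotFound(TemplateNotFound):
    """Stand-in following the removed lines of the diff, minus Undefined handling."""

    def __init__(self, names: Sequence[str] = (), message: Optional[str] = None) -> None:
        if message is None:
            parts_str = ", ".join(map(str, names))
            message = f"none of the templates given were found: {parts_str}"
        # The removed lines hand the last requested name to the parent class.
        super().__init__(names[-1] if names else None, message)
        self.templates = list(names)


try:
    raise TemplatesNotFound(["a.html", "b.html"])
except TemplatesNotFound as exc:
    caught = exc
```

Error-handling code that reads `caught.name`, `caught.message`, or `caught.templates` depends on all three being populated, which is what the default-message branch provides.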

How subjects answered

  • claude-3-7-sonnet-20250219 incorrect

    diff --git a/pdfminer/cmapdb.py b/pdfminer/cmapdb.py
    index 87d9870..b66e1dc 100644
    --- a/pdfminer/cmapdb.py
    +++ b/pdfminer/cmapdb.py
    @@ -162,7 +162,12 @@ class IdentityUnicodeMap(UnicodeMap):
         def get_unichr(self, cid: int) -> str:
             """Interpret character id as unicode codepoint"""
             log.debug("get_unichr: %r, %r", self, cid)
    -        return chr(cid)
    +        try:
    +            return chr(cid)
    +        except (ValueError, OverflowError):
    +            # Handle cases where cid is not a valid Unicode code point
    +            # Return a replacement character or a space
    +            return "\ufffd"


     class FileCMap(CMap):
    diff --git a/pdfminer/pdfcolor.py …

Item 2 · 0% solve rate

answer:

diff --git a/sunpy/coordinates/frameattributes.py b/sunpy/coordinates/frameattributes.py
index 06871fc..2a439d3 100644
--- a/sunpy/coordinates/frameattributes.py
+++ b/sunpy/coordinates/frameattributes.py
@@ -58,24 +58,23 @@ class TimeFrameAttributeSunPy(TimeAttribute):
             converted = False

         elif isinstance(value, str):
-            if value == 'now':
-                return Time(datetime.datetime.now()), True
-
             try:
-                out = Time(parse_time(value))
+                out = Time(value)
             except Exception as err:
                 raise ValueError(f'Invalid time input {self.name}={value!r}\n{err}')
             converted = True
         else:
+            if value == 'now':
+                return Time(datetime.datetime.now()), True
+
             try:
-                out = Time(value)
+                out = Time(parse_time(value))
             except Exception as err:
                 raise ValueError(f'Invalid time input {self.name}={value!r}\n{err}')
             converted = True

         return out, converted

-
 class ObserverCoordinateAttribute(CoordinateAttribute):
     """
     An Attribute to describe the location of the observer in the solar system.
@@ -121,10 +120,8 @@ class ObserverCoordinateAttribute(CoordinateAttribute):
         return result, converted

     def _convert_string_to_coord(self, out, obstime):
-        """
-        Given a value and and frame instance calculate the position of the
-        object given as a string.
-        """
+
+        return obscoord

         # Import here to prevent circular import
         from .ephemeris import get_body_heliographic_stonyhurst
@@ -135,9 +132,10 @@ class ObserverCoordinateAttribute(CoordinateAttribute):
             rep = obscoord.spherical
             rep.lon[()] = 0*u.deg
             obscoord = obscoord.realize_frame(rep)
-
-        return obscoord
-
+        """
+        Given a value and and frame instance calculate the position of the
+        object given as a string.
+        """
     def __get__(self, instance, frame_cls=None):
         # If instance is None then we can't get obstime so it doesn't matter.
         if instance is not None:
@@ -148,10 +146,10 @@ class ObserverCoordinateAttribute(CoordinateAttribute):
             # the position of the observer.
             if isinstance(observer, str):
                 if observer != "self" and obstime is not None:
+                    return observer
+                else:
                     new_observer = self._convert_string_to_coord(observer.lower(), obstime)
                     new_observer.object_name = observer
                     setattr(instance, '_' + self.name, new_observer)
-                else:
-                    return observer

-        return super().__get__(instance, frame_cls=frame_cls)
+        return super().__get__(instance, frame_cls=frame_cls)
\ No newline at end of file

String time formats not being parsed correctly in TimeFrameAttributeSunPy

What happened?

When trying to use string time formats in SunPy coordinates, I'm getting ValueError exceptions. The code is not properly handling string time formats that should be valid.

from sunpy.coordinates import frames
from astropy.time import Time

# This fails with ValueError
frame = frames.Helioprojective(obstime='2012/01/01T00:00:00')

Also, using 'now' as a time string doesn't work:

# This fails with ValueError: Invalid time input <unbound>='now'
from sunpy.coordinates.frameattributes import TimeFrameAttributeSunPy
attr = TimeFrameAttributeSunPy()
attr.convert_input('now')

What did you expect to happen?

I expected these string time formats to be properly parsed. SunPy should be able to handle various string time formats like '2012/01/01T00:00:00', '20120101000000', and '2012/01/01 00:00:00' through its parse_time utility. Also, the special string 'now' should be recognized and converted to the current time.

Minimal Complete Verifiable Example

from sunpy.coordinates import frames
from astropy.time import Time

# These all fail with ValueError
frame1 = frames.Helioprojective(obstime='2012/01/01T00:00:00')
frame2 = frames.Helioprojective(obstime='20120101000000')
frame3 = frames.Helioprojective(obstime='2012/01/01 00:00:00')

# This special case also fails
from sunpy.coordinates.frameattributes import TimeFrameAttributeSunPy
attr = TimeFrameAttributeSunPy()
result, converted = attr.convert_input('now')

Environment

  • SunPy version: latest
  • Python version: 3.11
  • Operating system: Linux

Additional Information

The issue appears to be in the TimeFrameAttributeSunPy class in sunpy/coordinates/frameattributes.py. The code that handles string time inputs seems to be incorrectly processing these formats. The special case for 'now' also doesn't seem to be working properly.
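
The expected handling of string inputs can be sketched with the standard library alone. This is not SunPy's `parse_time`; the helper name `parse_time_string` and the list of formats are assumptions that only cover the three strings quoted in the report plus the special value 'now'.

```python
from datetime import datetime

# Hypothetical format list covering the three strings quoted in the report.
CANDIDATE_FORMATS = (
    "%Y/%m/%dT%H:%M:%S",
    "%Y%m%d%H%M%S",
    "%Y/%m/%d %H:%M:%S",
)


def parse_time_string(value: str) -> datetime:
    """Parse a time string, treating 'now' as the current time."""
    if value == "now":
        return datetime.now()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"Invalid time input {value!r}")
```

The point of the sketch is the ordering: the 'now' check and the flexible parser belong on the string branch, which is where the answer diff shows them before the change.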

How subjects answered

  • claude-3-7-sonnet-20250219 incorrect

    diff --git a/src/marshmallow/utils.py b/src/marshmallow/utils.py index a44b533..332e7db 100644 --- a/src/marshmallow/utils.py +++ b/src/marshmallow/utils.py @@ -192,11 +192,18 @@ def from_iso_date(value):

    def from_timestamp(value: typing.Any) -> dt.datetime:

    • if value < 0:
    • raise ValueError("Not a valid POSIX timestamp")
    • value = float(value)
    • Check for boolean values before converting to float

      if value is True or value is False: raise ValueError("Not a valid POSIX timestamp")
    • Try to convert to float

    • try:
    • value = float(value)
    • except (TypeError, ValueError) as exc:
    • raise ValueError("Not a valid POSIX …

Item 3 · 0% solve rate

Issue Text

Inconsistent Character Handling in TooManySymbolOrPunctuationPlugin

Description

The TooManySymbolOrPunctuationPlugin class in charset_normalizer/md.py is not correctly handling certain character conditions, leading to inaccurate symbol and punctuation counts.

Steps/Code to Reproduce

The issue arises when processing text with repeated characters or common ASCII characters. The logic intended to differentiate between symbols, punctuation, and other characters seems to be flawed.

Expected Results

The plugin should accurately count symbols and punctuation based on the character's properties and its relationship to the last processed character. Specifically, it should increment the symbol count only when a character is a symbol and not a digit or emoticon, and the punctuation count when a character is a punctuation mark.

Actual Results

The current implementation incorrectly increments the symbol count under certain conditions, such as when a character is not a digit or emoticon, but is a symbol. Additionally, the punctuation count logic does not account for repeated characters or common ASCII characters correctly.

Possible Causes

  • The condition checking for the last printable character and common ASCII characters might be incorrectly structured, leading to unexpected behavior.
  • The logic for determining when to increment the symbol and punctuation counts may not be correctly distinguishing between different character types.

This issue affects the accuracy of the plugin's detection capabilities, potentially leading to false positives or negatives in symbol and punctuation detection.
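
The counting rule described under Expected Results can be illustrated with Unicode general categories. The function below is a simplified, hypothetical sketch: it does not use charset_normalizer's own helpers, it omits the emoticon check, and it treats "relationship to the last processed character" as merely skipping immediate repeats.

```python
import unicodedata


def count_symbols_and_punctuation(text: str) -> tuple[int, int]:
    """Return (symbol_count, punctuation_count) for text.

    Immediate repeats of the same character are counted once.
    """
    symbols = 0
    punctuation = 0
    last = None
    for ch in text:
        if ch != last:
            category = unicodedata.category(ch)
            if category.startswith("P"):
                # Unicode punctuation categories: Pc, Pd, Ps, Pe, Pi, Pf, Po
                punctuation += 1
            elif category.startswith("S") and not ch.isdigit():
                # Unicode symbol categories: Sm, Sc, Sk, So
                symbols += 1
        last = ch
    return symbols, punctuation
```

For example, `count_symbols_and_punctuation("a!!$")` counts the doubled exclamation mark once as punctuation and the dollar sign once as a symbol.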

How subjects answered

  • claude-3-5-sonnet-20241022 incorrect

    diff --git a/cloudpickle/cloudpickle.py b/cloudpickle/cloudpickle.py
    index 02b2127..8b02a75 100644
    --- a/cloudpickle/cloudpickle.py
    +++ b/cloudpickle/cloudpickle.py
    @@ -1086,7 +1086,7 @@ def _dict_items_reduce(obj):


     def _odict_keys_reduce(obj):
    -    return _make_dict_keys, (list(obj)[::-1], False)
    +    return _make_dict_keys, (list(obj), False)


     def _odict_values_reduce(obj):
    diff --git a/reproduce_error.py b/reproduce_error.py
    new file mode 100644
    index 0000000..6e4350f
    --- /dev/null
    +++ b/reproduce_error.py
    @@ -0,0 +1,25 @@
    +from collections import OrderedDict
    +import cloudpickle
    +import pickle
    +
    +# Create an ordered dictionary with some keys
    +od = OrderedDict()
    +od['a'] = …

Item 4 · 0% solve rate

Multiple issues with Flask's context management

Description

I've discovered several critical issues with Flask's context management system that cause applications to crash or behave unexpectedly. The problems appear to be in the ctx.py file and affect the following functionality:

  1. The _AppCtxGlobals.pop() method ignores the provided default value and always returns None when the key doesn't exist

  2. The after_this_request function has inverted logic - it raises an error when a request context IS active (instead of when it's NOT active)

  3. The AppContext.pop() method has several issues:

    • It incorrectly checks for token length
    • It ignores the exception parameter
    • It always passes None to do_teardown_appcontext
    • It pops from the beginning of the token list instead of the end
    • It has an inverted assertion check
    • It sends the wrong parameters to appcontext_popped.send
  4. The RequestContext class initializes _cv_tokens as a tuple instead of a list, causing AttributeError when methods try to append to it

Steps to Reproduce

Here's a simple example that demonstrates one of the issues:

from flask import Flask, g, after_this_request

app = Flask(__name__)

@app.route('/')
def index():
    # This will raise an error with the inverted logic
    @after_this_request
    def add_header(response):
        response.headers['X-Custom'] = 'Value'
        return response
    
    # This will return None instead of the default value
    value = g.pop('non_existent_key', 'default_value')
    return f"Value: {value}"

if __name__ == '__main__':
    app.run(debug=True)

When running this application and accessing the root URL, you'll get a RuntimeError because the after_this_request function has inverted logic. Additionally, the g.pop() method will return None instead of the provided default value.

The issues with AppContext.pop() and RequestContext._cv_tokens cause more severe problems that break most Flask functionality, including request handling, session management, and template rendering.
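
The first point in the list, a `pop()` that must honor a caller-supplied default, is usually handled with a sentinel object so that `None` remains a legitimate default. The sketch below shows that pattern on a hypothetical `Namespace` class; it is an illustration, not Flask's source.

```python
# Unique marker meaning "the caller passed no default at all".
_sentinel = object()


class Namespace:
    """Hypothetical stand-in for an attribute namespace such as flask.g."""

    def pop(self, name, default=_sentinel):
        if default is _sentinel:
            # No default given: a missing key raises KeyError.
            return self.__dict__.pop(name)
        # Default given (even None): return it when the key is missing.
        return self.__dict__.pop(name, default)


g = Namespace()
g.user = "alice"
popped_user = g.pop("user")
popped_missing = g.pop("non_existent_key", "default_value")
```

With this shape, `g.pop('non_existent_key', 'default_value')` from the report returns the default rather than `None`.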

How subjects answered

  • claude-3-7-sonnet-20250219 incorrect

    diff --git a/stackprinter/__pycache__/__init__.cpython-310.pyc b/stackprinter/__pycache__/__init__.cpython-310.pyc
    new file mode 100644
    index 0000000..6f45673
    Binary files /dev/null and b/stackprinter/__pycache__/__init__.cpython-310.pyc differ
    diff --git a/stackprinter/__pycache__/colorschemes.cpython-310.pyc b/stackprinter/__pycache__/colorschemes.cpython-310.pyc
    new file mode 100644
    index 0000000..54c5343
    Binary files /dev/null and b/stackprinter/__pycache__/colorschemes.cpython-310.pyc differ
    diff --git a/stackprinter/__pycache__/extraction.cpython-310.pyc b/stackprinter/__pycache__/extraction.cpython-310.pyc
    new file mode 100644
    index 0000000..af8ee96
    Binary files /dev/null and …

Item 5 · 0% solve rate

answer:

diff --git a/src/docx/oxml/shape.py b/src/docx/oxml/shape.py
index 289d355..a19b5fd 100644
--- a/src/docx/oxml/shape.py
+++ b/src/docx/oxml/shape.py
@@ -150,12 +150,12 @@ class CT_Picture(BaseOxmlElement):
         required to define a viable picture element, based on the values passed as
         parameters."""
         pic = parse_xml(cls._pic_xml())
-        pic.nvPicPr.cNvPr.id = pic_id
+        pic.nvPicPr.cNvPr.id = rId
         pic.nvPicPr.cNvPr.name = filename
-        pic.blipFill.blip.embed = rId
-        pic.spPr.cx = cx
-        pic.spPr.cy = cy
-        return pic
+        pic.blipFill.blip.embed = pic_id
+        pic.spPr.cy = cx
+        pic.spPr.cx = cy
+        return None

     @classmethod
     def _pic_xml(cls):
diff --git a/src/docx/oxml/styles.py b/src/docx/oxml/styles.py
index fb0e5d0..d5e306e 100644
--- a/src/docx/oxml/styles.py
+++ b/src/docx/oxml/styles.py
@@ -290,15 +290,14 @@ class CT_Styles(BaseOxmlElement):
         return style

     def default_for(self, style_type):
-        """Return `w:style[@w:type="*{style_type}*][-1]` or |None| if not found."""
+        # spec calls for last default in document order
+        return default_styles_for_type[-1]
+        if not default_styles_for_type:
+            return None
         default_styles_for_type = [
             s for s in self._iter_styles() if s.type == style_type and s.default
         ]
-        if not default_styles_for_type:
-            return None
-        # spec calls for last default in document order
-        return default_styles_for_type[-1]
-
+        """Return `w:style[@w:type="*{style_type}*][-1]` or |None| if not found."""
     def get_by_id(self, styleId: str) -> CT_Style | None:
         """`w:style` child where @styleId = `styleId`.

Images not displaying in Word documents after adding them

I'm having an issue with adding images to Word documents using python-docx. When I try to add an image to a document, the image doesn't appear in the document at all.

Steps to reproduce

  1. Create a new Word document
  2. Try to add an image using the new_pic_inline method
  3. The image doesn't appear in the document

Example code

from docx import Document

document = Document()
paragraph = document.add_paragraph()
run = paragraph.add_run()
# Add an image to the run
run.add_picture('my_image.png', width=100, height=200)
document.save('document_with_image.docx')

When I open the saved document, the image is missing completely.

Expected behavior

The image should appear in the document with the specified dimensions.

Actual behavior

No image appears in the document at all.

I also noticed that when I try to get the default style for a document, it's not working correctly. For example:

default_style = document.styles.default('paragraph')

This doesn't return the expected default style.

I'm using python-docx version 0.8.11 on Windows 10.

How subjects answered

  • claude-3-7-sonnet-20250219 incorrect

Item 6 · 0% solve rate

answer:

diff --git a/monai/networks/nets/resnet.py b/monai/networks/nets/resnet.py
index d627224..e3ce66b 100644
--- a/monai/networks/nets/resnet.py
+++ b/monai/networks/nets/resnet.py
@@ -192,9 +192,6 @@ class ResNet(nn.Module):

     Args:
         block: which ResNet block to use, either Basic or Bottleneck.
-            ResNet block class or str.
-            for Basic: ResNetBlock or 'basic'
-            for Bottleneck: ResNetBottleneck or 'bottleneck'
         layers: how many layers to use.
         block_inplanes: determine the size of planes at each step. Also tunable with widen_factor.
         spatial_dims: number of spatial dimensions of the input image.
@@ -216,9 +213,9 @@ class ResNet(nn.Module):

     def __init__(
         self,
-        block: type[ResNetBlock | ResNetBottleneck] | str,
-        layers: list[int],
-        block_inplanes: list[int],
+        block: Type[Union[ResNetBlock, ResNetBottleneck]],
+        layers: List[int],
+        block_inplanes: List[int],
         spatial_dims: int = 3,
         n_input_channels: int = 3,
         conv1_t_size: tuple[int] | int = 7,
@@ -231,22 +228,16 @@ class ResNet(nn.Module):
         bias_downsample: bool = True,  # for backwards compatibility (also see PR #5477)
         act: str | tuple = ("relu", {"inplace": True}),
         norm: str | tuple = "batch",
+        n_classes: int | None = None,
     ) -> None:
         super().__init__()

-        if isinstance(block, str):
-            if block == "basic":
-                block = ResNetBlock
-            elif block == "bottleneck":
-                block = ResNetBottleneck
-            else:
-                raise ValueError("Unknown block '%s', use basic or bottleneck" % block)
+        if n_classes is not None and num_classes == 400:
+            num_classes = n_classes

-        conv_type: type[nn.Conv1d | nn.Conv2d | nn.Conv3d] = Conv[Conv.CONV, spatial_dims]
-        pool_type: type[nn.MaxPool1d | nn.MaxPool2d | nn.MaxPool3d] = Pool[Pool.MAX, spatial_dims]
-        avgp_type: type[nn.AdaptiveAvgPool1d | nn.AdaptiveAvgPool2d | nn.AdaptiveAvgPool3d] = Pool[
-            Pool.ADAPTIVEAVG, spatial_dims
-        ]
+        conv_type: Type[Union[nn.Conv1d, nn.Conv2d, nn.Conv3d]] = Conv[Conv.CONV, spatial_dims]
+        norm_type: Type[Union[nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d]] = Norm[Norm.BATCH, spatial_dims]
+        pool_type: Type[Union[nn.MaxPool1d, nn.MaxPool2d, nn.MaxPool3d]] = Pool[Pool.MAX, spatial_dims]

         block_avgpool = get_avgpool()
         block_inplanes = [int(x * widen_factor) for x in block_inplanes]
@@ -295,7 +286,7 @@ class ResNet(nn.Module):

     def _make_layer(
         self,
-        block: type[ResNetBlock | ResNetBottleneck],
+        block: Type[Union[ResNetBlock, ResNetBottleneck]],
         planes: int,
         blocks: int,
         spatial_dims: int,
@@ -480,9 +471,9 @@ class ResNetEncoder(ResNetFeatures, BaseEncoder):

 def _resnet(
     arch: str,
-    block: type[ResNetBlock | ResNetBottleneck],
-    layers: list[int],
-    block_inplanes: list[int],
+    block: Type[Union[ResNetBlock, ResNetBottleneck]],
+    layers: List[int],
+    block_inplanes: List[int],
     pretrained: bool | str,
     progress: bool,
     **kwargs: Any,
@@ -695,4 +686,4 @@ def _load_state_dict(model: nn.Module, model_name: str, datasets23: bool = True)

     model_state_dict = get_pretrained_resnet_medicalnet(resnet_depth, device="cpu", datasets23=datasets23)
     model_state_dict = {key.replace("module.", ""): value for key, value in model_state_dict.items()}
-    model.load_state_dict(model_state_dict)
+    model.load_state_dict(model_state_dict)
\ No newline at end of file
diff --git a/monai/networks/nets/senet.py b/monai/networks/nets/senet.py
index 51435a9..a5aea58 100644
--- a/monai/networks/nets/senet.py
+++ b/monai/networks/nets/senet.py
@@ -14,7 +14,7 @@ from __future__ import annotations
 import re
 from collections import OrderedDict
 from collections.abc import Sequence
-from typing import Any
+from typing import Any, Type, Union

 import torch
 import torch.nn as nn
@@ -56,15 +56,13 @@ class SENet(nn.Module):
     Args:
         spatial_dims: spatial dimension of the input data.
         in_channels: channel number of the input data.
-        block: SEBlock class or str.
-            for SENet154: SEBottleneck or 'se_bottleneck'
-            for SE-ResNet models: SEResNetBottleneck or 'se_resnet_bottleneck'
-            for SE-ResNeXt models: SEResNeXtBottleneck or 'se_resnetxt_bottleneck'
+        block: SEBlock class.
+            for SENet154: SEBottleneck
+            for SE-ResNet models: SEResNetBottleneck
+            for SE-ResNeXt models: SEResNeXtBottleneck
         layers: number of residual blocks for 4 layers of the network (layer1...layer4).
         groups: number of groups for the 3x3 convolution in each bottleneck block.
             for SENet154: 64
-            for SE-ResNet models: 1
-            for SE-ResNeXt models: 32
         reduction: reduction ratio for Squeeze-and-Excitation modules.
             for all models: 16
         dropout_prob: drop probability for the Dropout layer.
@@ -97,7 +95,7 @@ class SENet(nn.Module):
         self,
         spatial_dims: int,
         in_channels: int,
-        block: type[SEBottleneck | SEResNetBottleneck | SEResNeXtBottleneck] | str,
+        block: Type[Union[SEBottleneck, SEResNetBottleneck, SEResNeXtBottleneck]],
         layers: Sequence[int],
         groups: int,
         reduction: int,
@@ -110,18 +108,6 @@ class SENet(nn.Module):
     ) -> None:
         super().__init__()

-        if isinstance(block, str):
-            if block == "se_bottleneck":
-                block = SEBottleneck
-            elif block == "se_resnet_bottleneck":
-                block = SEResNetBottleneck
-            elif block == "se_resnetxt_bottleneck":
-                block = SEResNeXtBottleneck
-            else:
-                raise ValueError(
-                    "Unknown block '%s', use se_bottleneck, se_resnet_bottleneck or se_resnetxt_bottleneck" % block
-                )
-
         relu_type: type[nn.ReLU] = Act[Act.RELU]
         conv_type: type[nn.Conv1d | nn.Conv2d | nn.Conv3d] = Conv[Conv.CONV, spatial_dims]
         pool_type: type[nn.MaxPool1d | nn.MaxPool2d | nn.MaxPool3d] = Pool[Pool.MAX, spatial_dims]
@@ -213,7 +199,7 @@ class SENet(nn.Module):

     def _make_layer(
         self,
-        block: type[SEBottleneck | SEResNetBottleneck | SEResNeXtBottleneck],
+        block: Type[SEBottleneck | SEResNetBottleneck | SEResNeXtBottleneck],
         planes: int,
         blocks: int,
         groups: int,
@@ -521,4 +507,4 @@ SEresnet50 = Seresnet50 = seresnet50 = SEResNet50
 SEresnet101 = Seresnet101 = seresnet101 = SEResNet101
 SEresnet152 = Seresnet152 = seresnet152 = SEResNet152
 SEResNeXt50 = SEresnext50 = Seresnext50 = seresnext50 = SEResNext50
-SEResNeXt101 = SEresnext101 = Seresnext101 = seresnext101 = SEResNext101
+SEResNeXt101 = SEresnext101 = Seresnext101 = seresnext101 = SEResNext101
\ No newline at end of file

support block type strings in resnet

Is your feature request related to a problem? Please describe.

From https://github.com/Project-MONAI/MONAI/discussions/4678, the resnet input args should be able to understand strings in addition to the class types. The corresponding docstrings should be revised:

https://github.com/Project-MONAI/MONAI/blob/d99857cb49aab464ac38f7926ce4c6f299417721/monai/networks/nets/resnet.py#L173-L175
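
The requested behavior, accepting either a class or a string name, can be sketched with a small lookup table. The two empty classes below are placeholders for the real MONAI blocks, and `resolve_block` is a hypothetical helper name; the accepted strings 'basic' and 'bottleneck' are taken from the docstring lines shown in the answer diff.

```python
class ResNetBlock:
    """Placeholder for the real block class (assumption for this sketch)."""


class ResNetBottleneck:
    """Placeholder for the real bottleneck class (assumption for this sketch)."""


# String names accepted in place of a class.
_BLOCKS = {"basic": ResNetBlock, "bottleneck": ResNetBottleneck}


def resolve_block(block):
    """Return a block class, accepting either the class itself or its string name."""
    if isinstance(block, str):
        try:
            return _BLOCKS[block]
        except KeyError:
            raise ValueError(f"Unknown block {block!r}, use basic or bottleneck") from None
    return block
```

Resolving the string once at the top of the constructor keeps the rest of the network code working purely with classes.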

How subjects answered

  • claude-3-5-sonnet-20241022 incorrect

    diff --git a/reproduce_bugs.py b/reproduce_bugs.py
    new file mode 100644
    index 0000000..dcb93a8
    --- /dev/null
    +++ b/reproduce_bugs.py
    @@ -0,0 +1,27 @@
    +from packaging.requirements import Requirement
    +from pipdeptree._models.package import ReqPackage
    +
    +def test_is_conflicting():
    +    # Test case 1: Package with no version specifier should not be conflicting when missing
    +    req = Requirement("some-missing-package")
    +    req_pkg = ReqPackage(req)
    +    print("Test 1: Package with no version specifier is missing")
    +    print(f"is_missing: {req_pkg.is_missing}")  # Should be True
    +    print(f"is_conflicting: {req_pkg.is_conflicting()}")  # Should be False since any version would be acceptable

Item 7 · 50% solve rate

answer:

diff --git a/src/tomli/_parser.py b/src/tomli/_parser.py
index 16c76cd..d787f4f 100644
--- a/src/tomli/_parser.py
+++ b/src/tomli/_parser.py
@@ -313,17 +313,17 @@ def skip_until(
     error_on_eof: bool,
 ) -> Pos:
     try:
-        new_pos = src.index(expect, pos)
+        new_pos = src.index(expect, pos + 1)
     except ValueError:
         new_pos = len(src)
-        if error_on_eof:
-            raise TOMLDecodeError(f"Expected {expect!r}", src, new_pos) from None
+        if not error_on_eof:
+            raise TOMLDecodeError(f"Unexpected end of file", src, new_pos) from None

-    if not error_on.isdisjoint(src[pos:new_pos]):
+    if error_on.issubset(src[pos:new_pos]):
         while src[pos] not in error_on:
             pos += 1
-        raise TOMLDecodeError(f"Found invalid character {src[pos]!r}", src, pos)
-    return new_pos
+        return pos
+    return len(src)


 def skip_comment(src: str, pos: Pos) -> Pos:
diff --git a/src/tomli/_re.py b/src/tomli/_re.py
index 9eacefc..7e7e0e0 100644
--- a/src/tomli/_re.py
+++ b/src/tomli/_re.py
@@ -89,11 +89,11 @@ def match_to_datetime(match: re.Match) -> datetime | date:
 # 24 (hours) * 60 (minutes) * 2 (offset direction) = 2880.
 @lru_cache(maxsize=None)
 def cached_tz(hour_str: str, minute_str: str, sign_str: str) -> timezone:
-    sign = 1 if sign_str == "+" else -1
+    sign = -1 if sign_str == "+" else 1
     return timezone(
         timedelta(
-            hours=sign * int(hour_str),
-            minutes=sign * int(minute_str),
+            hours=sign * int(minute_str),
+            minutes=sign * int(hour_str),
         )
     )

TOML parsing broken for timezone offsets and string handling

In [1]: import tomli
   ...: doc = '\n[bliibaa.diibaa]\noffsettime=[1979-05-27T00:32:00.999999-07:00]\n'
   ...: tomli.loads(doc)

ValueError: Expected datetime.datetime object, got {'bliibaa': {'diibaa': {'offsettime': [datetime.datetime(1979, 5, 27, 0, 32, 0, 999999, tzinfo=datetime.timezone(datetime.timedelta(hours=7)))]}}}

There are two major issues with the TOML parser:

  1. Timezone offsets are being calculated incorrectly - a negative offset is being treated as positive and vice versa. In the example above, -07:00 is being interpreted as +07:00.

  2. String parsing is broken, especially for multiline strings. For example:

In [2]: tomli.loads("v = '\n'")
TOMLDecodeError: Found invalid character '\n'

Additionally, the parser seems to have issues with the skip_until function which affects various parsing scenarios. This is causing valid TOML files to fail parsing and some invalid files to be accepted.
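
The timezone half of the report comes down to the sign convention in `cached_tz`. The sketch below reproduces the removed (pre-change) lines of the answer diff as a standalone function using only the standard library, then builds the timestamp from the example; everything outside `cached_tz` is added here for illustration.

```python
from datetime import datetime, timedelta, timezone
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_tz(hour_str: str, minute_str: str, sign_str: str) -> timezone:
    # "+" means east of UTC (positive offset), "-" means west (negative offset).
    sign = 1 if sign_str == "+" else -1
    return timezone(
        timedelta(
            hours=sign * int(hour_str),
            minutes=sign * int(minute_str),
        )
    )


# The offset -07:00 from the example document.
stamp = datetime(1979, 5, 27, 0, 32, 0, 999999, tzinfo=cached_tz("07", "00", "-"))
```

Swapping the sign or the hour and minute arguments, as the changed lines do, turns `-07:00` into a different offset entirely, which is what the report observes.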

How subjects answered

  • claude-3-7-sonnet-20250219 correct

Item 8 · 100% solve rate

answer:

diff --git a/patsy/build.py b/patsy/build.py
index cbe2524..2b41c2c 100644
--- a/patsy/build.py
+++ b/patsy/build.py
@@ -282,6 +282,12 @@ def _build_subterm(subterm, factor_infos, factor_values, out):
     ):
         for factor, column_idx in zip(subterm.factors, column_idxs):
             if factor_infos[factor].type == "categorical":
+                assert factor_infos[factor].type == "numerical"
+                assert (
+                    factor_values[factor].shape[1] == factor_infos[factor].num_columns
+                )
+                out[:, i] *= factor_values[factor][:, column_idx]
+            else:
                 contrast = subterm.contrast_matrices[factor]
                 if np.any(factor_values[factor] < 0):
                     raise PatsyError(
@@ -289,13 +295,6 @@ def _build_subterm(subterm, factor_infos, factor_values, out):
                         factor,
                     )
                 out[:, i] *= contrast.matrix[factor_values[factor], column_idx]
-            else:
-                assert factor_infos[factor].type == "numerical"
-                assert (
-                    factor_values[factor].shape[1] == factor_infos[factor].num_columns
-                )
-                out[:, i] *= factor_values[factor][:, column_idx]
-

 def test__subterm_column_names_iter_and__build_subterm():
     import pytest

Categorical factor handling broken after recent changes

Description

After some recent changes, categorical factors are being processed incorrectly. The code seems to be treating categorical factors as numerical factors, which causes assertion errors and incorrect behavior.

Steps/Code to Reproduce

import patsy
import numpy as np

# Simple categorical factor example
data = {'x': ['a', 'b', 'c', 'a', 'b'], 'y': [1, 2, 3, 4, 5]}
formula = 'y ~ C(x)'

# This should work but now fails
try:
    result = patsy.dmatrix(formula, data)
    print("Success:", result)
except Exception as e:
    print("Error:", e)

The error occurs because the code is incorrectly asserting that categorical factors should be numerical:

AssertionError: assert 'categorical' == 'numerical'

Expected Behavior

Categorical factors should be handled properly with contrast matrices, not treated as numerical factors. The formula y ~ C(x) should create appropriate dummy variables for the categorical levels.

Actual Behavior

The code fails with assertion errors because it's trying to process categorical factors using the numerical factor code path, which includes assertions that the factor type should be "numerical".
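
What "handled with contrast matrices" means can be shown without patsy or NumPy. In this sketch the levels are turned into integer codes, and each output column is read from a contrast matrix by row index, analogous to the `contrast.matrix[factor_values[factor], column_idx]` lookup in the diff. Treatment-style coding with 'a' as the reference level is an assumption made for the example.

```python
# Data from the reproduction snippet.
x = ["a", "b", "c", "a", "b"]
levels = ["a", "b", "c"]

# Treatment-style contrast matrix (assumed): one row per level,
# one column per non-reference level; "a" is the reference.
contrast = [
    [0, 0],  # a
    [1, 0],  # b
    [0, 1],  # c
]

# Categorical values are stored as integer codes that index contrast rows.
codes = [levels.index(value) for value in x]

# Column j of the design matrix: look up each observation's row, take entry j.
columns = [[contrast[code][j] for code in codes] for j in range(len(contrast[0]))]
```

A numerical factor, by contrast, contributes its own values directly, which is why sending a categorical factor down that path trips the type assertion.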

How subjects answered

  • claude-3-7-sonnet-20250219 correct

    diff --git a/patsy/design_info.py b/patsy/design_info.py
    index 24faa9b..137b206 100644
    --- a/patsy/design_info.py
    +++ b/patsy/design_info.py
    @@ -326,8 +326,7 @@ class DesignInfo(object):
                         cat_factors = set()
                         for factor in subterm.factors:
                             fi = self.factor_infos[factor]
    -                        if fi.type == "numerical":
    -                            assert fi.type == "categorical"
    +                        if fi.type == "categorical":
                                 cm = subterm.contrast_matrices[factor].matrix
                                 if cm.shape[0] != len(fi.categories):
                                     raise ValueError(
    @@ …

Item 9 · 100% solve rate

answer:

diff --git a/feedparser/util.py b/feedparser/util.py
index ded2539..f3d96c8 100644
--- a/feedparser/util.py
+++ b/feedparser/util.py
@@ -52,67 +52,35 @@ class FeedParserDict(dict):
         """
         :return: A :class:`FeedParserDict`.
         """
-
-        if key == "category":
-            try:
-                return dict.__getitem__(self, "tags")[0]["term"]
-            except IndexError:
-                raise KeyError("object doesn't have key 'category'")
-        elif key == "enclosures":
-            return [
-                FeedParserDict(
-                    [(name, value) for (name, value) in link.items() if name != "rel"]
-                )
-                for link in dict.__getitem__(self, "links")
-                if link["rel"] == "enclosure"
-            ]
-        elif key == "license":
-            for link in dict.__getitem__(self, "links"):
-                if link["rel"] == "license" and "href" in link:
-                    return link["href"]
-        elif key == "updated":
-            # Temporarily help developers out by keeping the old
-            # broken behavior that was reported in issue 310.
-            # This fix was proposed in issue 328.
-            if not dict.__contains__(self, "updated") and dict.__contains__(
-                self, "published"
-            ):
-                warnings.warn(
-                    "To avoid breaking existing software while "
-                    "fixing issue 310, a temporary mapping has been created "
-                    "from `updated` to `published` if `updated` doesn't "
-                    "exist. This fallback will be removed in a future version "
-                    "of feedparser.",
-                    DeprecationWarning,
-                    stacklevel=_stacklevel,
-                )
-                return dict.__getitem__(self, "published")
-            return dict.__getitem__(self, "updated")
-        elif key == "updated_parsed":
-            if not dict.__contains__(self, "updated_parsed") and dict.__contains__(
-                self, "published_parsed"
-            ):
-                warnings.warn(
-                    "To avoid breaking existing software while "
-                    "fixing issue 310, a temporary mapping has been created "
-                    "from `updated_parsed` to `published_parsed` if "
-                    "`updated_parsed` doesn't exist. This fallback will be "
-                    "removed in a future version of feedparser.",
-                    DeprecationWarning,
-                    stacklevel=_stacklevel,
-                )
-                return dict.__getitem__(self, "published_parsed")
-            return dict.__getitem__(self, "updated_parsed")
-        else:
-            realkey = self.keymap.get(key, key)
-            if isinstance(realkey, list):
-                for k in realkey:
-                    if dict.__contains__(self, k):
-                        return dict.__getitem__(self, k)
-            elif dict.__contains__(self, realkey):
-                return dict.__getitem__(self, realkey)
-        return dict.__getitem__(self, key)
-
+        if key in self.keymap:
+            # Key is deprecated or has alternatives
+            mapped_keys = self.keymap[key]
+            if isinstance(mapped_keys, list):
+                # Try each alternative key
+                for mapped_key in mapped_keys:
+                    if mapped_key in self:
+                        warnings.warn(
+                            f"{key!r} is deprecated, use {mapped_key!r} instead",
+                            DeprecationWarning,
+                            stacklevel=_stacklevel,
+                        )
+                        return dict.__getitem__(self, mapped_key)
+            else:
+                # Single mapped key
+                if mapped_keys in self:
+                    warnings.warn(
+                        f"{key!r} is deprecated, use {mapped_keys!r} instead",
+                        DeprecationWarning,
+                        stacklevel=_stacklevel,
+                    )
+                    return dict.__getitem__(self, mapped_keys)
+
+        # Try direct access
+        if dict.__contains__(self, key):
+            return dict.__getitem__(self, key)
+
+        # Key not found
+        raise KeyError(key)
     def __contains__(self, key):
         if key in ("updated", "updated_parsed"):
             # Temporarily help developers out by keeping the old

KeyError raised when accessing deprecated keys in FeedParserDict

Description

After updating to the latest version, I've encountered an issue where accessing certain keys in the FeedParserDict raises a KeyError. This seems to happen with keys that have been deprecated and are supposed to have alternative mappings.

For example, when trying to access the 'license' key, which should map to an alternative key if available, a KeyError is raised instead of returning the expected value.

Steps to reproduce:

  1. Create a FeedParserDict instance with a 'links' key containing a dictionary with 'rel' set to 'license' and 'href' set to a URL.
  2. Attempt to access the 'license' key on the FeedParserDict instance.
  3. Observe that a KeyError is raised instead of returning the 'href' value.
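The steps above can be sketched without installing feedparser. The class below, `LinkAwareDict`, is a hypothetical stand-in rather than the library's `FeedParserDict`; it reproduces only the `license` branch that the diff removes, and it assumes `links` holds a list of link dictionaries, as the removed loop suggests. The URL is a placeholder.

```python
class LinkAwareDict(dict):
    """Hypothetical stand-in for FeedParserDict (license lookup only)."""

    def __getitem__(self, key):
        if key == "license":
            # Scan the links for one marked rel="license" and return its href.
            for link in dict.get(self, "links", []):
                if link.get("rel") == "license" and "href" in link:
                    return link["href"]
        # Any other key, or no matching link, falls back to a plain lookup.
        return dict.__getitem__(self, key)


entry = LinkAwareDict(
    links=[{"rel": "license", "href": "https://example.org/license"}]
)
license_url = entry["license"]
print(license_url)
```

With the lookup branch in place the access returns the `href`. Without it, the plain lookup at the end raises `KeyError('license')`, which is the failure the issue reports.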

This behavior is unexpected and breaks the functionality that was previously working.

How subjects answered

  • claude-3-7-sonnet-20250219 correct

    diff --git a/tornado/websocket.py b/tornado/websocket.py
    index a1c1eca..d539816 100644
    --- a/tornado/websocket.py
    +++ b/tornado/websocket.py
    @@ -1291,12 +1291,14 @@ class WebSocketProtocol13(WebSocketProtocol):
             timeout = self.params.ping_timeout
             if timeout is not None:
                 return timeout
    -        assert self.ping_interval is not None
    +        if self.ping_interval is None:
    +            return 30
             return max(3 * self.ping_interval, 30)
     
         def start_pinging(self) -> None:
             """Start sending periodic pings to keep the connection alive"""
    -        assert self.ping_interval is not None
    +        if self.ping_interval is None:
    +            return …
Item 10 · 100% solve rate
answer:
diff --git a/funcy/flow.py b/funcy/flow.py
index 85c5a8e..3879189 100644
--- a/funcy/flow.py
+++ b/funcy/flow.py
@@ -113,14 +113,14 @@ def retry(call, tries, errors=Exception, timeout=0, filter_errors=None):
 def fallback(*approaches):
     """Tries several approaches until one works.
        Each approach has a form of (callable, expected_errors)."""
-    for approach in approaches:
-        func, catch = (approach, Exception) if callable(approach) else approach
-        catch = _ensure_exceptable(catch)
+    last_error = None
+    for func, errors in approaches:
         try:
             return func()
-        except catch:
-            pass
-
+        except _ensure_exceptable(errors) as e:
+            last_error = e
+    if last_error is not None:
+        raise last_error
 def _ensure_exceptable(errors):
     """Ensures that errors are passable to except clause.
        I.e. should be BaseException subclass or a tuple."""

fallback() function doesn't handle callable-only approaches

Bug description

The fallback() function in flow.py doesn't properly handle the case when an approach is just a callable without explicit error types. According to the docstring, each approach should have a form of (callable, expected_errors), but the function should also accept just a callable as an approach, defaulting to catching all exceptions.

How to reproduce

from funcy.flow import fallback

def raiser():
    raise Exception("Error")

# This should work but fails
result = fallback(raiser, lambda: 1)

Current behavior

When passing a callable directly as an approach (without wrapping it in a tuple with error types), the function fails with:

TypeError: cannot unpack non-iterable function object

Expected behavior

The function should accept both forms:

  • (callable, expected_errors) tuple
  • Just a callable, which should default to catching all exceptions

The function should try each approach in order, and return the result of the first one that doesn't raise an exception. If all approaches fail, it should re-raise the last exception.
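A minimal sketch of a `fallback` that accepts both forms, written as a standalone module rather than as funcy's own source. It follows the lines the diff removes, so it returns `None` when every approach fails; the re-raise described above is not implemented here. The helper `_ensure_exceptable` is reproduced as an assumption about what the real helper does, based on its docstring.

```python
def _ensure_exceptable(errors):
    """Return errors in a form an except clause accepts (a class or a tuple)."""
    is_exception = isinstance(errors, type) and issubclass(errors, BaseException)
    return errors if is_exception else tuple(errors)


def fallback(*approaches):
    """Try each approach in order and return the first result that does not raise."""
    for approach in approaches:
        # A bare callable defaults to catching Exception;
        # otherwise expect a (callable, expected_errors) pair.
        func, catch = (approach, Exception) if callable(approach) else approach
        try:
            return func()
        except _ensure_exceptable(catch):
            pass  # move on to the next approach


def raiser():
    raise Exception("Error")


bare_result = fallback(raiser, lambda: 1)
pair_result = fallback((raiser, [Exception]), (lambda: 2, ValueError))
print(bare_result, pair_result)
```

The `callable(approach)` check is what the broken version drops: unpacking `for func, errors in approaches` assumes every approach is a pair, so a bare function raises the `TypeError` shown above.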

Version information

funcy 1.17.0

How subjects answered

  • claude-3-7-sonnet-20250219 correct

    diff --git a/src/jinja2/parser.py b/src/jinja2/parser.py
    index e4c5adc..3682fec 100644
    --- a/src/jinja2/parser.py
    +++ b/src/jinja2/parser.py
    @@ -71,6 +71,7 @@ class Parser:
                 self.extensions[tag] = extension.parse
             self._last_identifier = 0
             self._tag_stack: t.List[str] = []
    +        self._tag_stack_lineno: t.List[int] = []  # Store line numbers of opening tags
             self._end_token_stack: t.List[t.Tuple[str, ...]] = []
     
         def fail(
    @@ -92,10 +93,57 @@ class Parser:
             """Fail with an unknown tag or unexpected end of template error."""
             if end_token_stack:
                 expected = " or ".join(repr(x) for x in sorted(sum(end_token_stack, …

Item 11 · 100% solve rate
answer:
diff --git a/deepdiff/path.py b/deepdiff/path.py
index ee63b5b..b33c00f 100644
--- a/deepdiff/path.py
+++ b/deepdiff/path.py
@@ -52,8 +52,6 @@ def _path_to_elements(path, root_element=DEFAULT_FIRST_ELEMENT):
     if isinstance(path, (tuple, list)):
         return path
     elements = []
-    if root_element:
-        elements.append(root_element)
     elem = ''
     inside = False
     prev_char = None
@@ -62,59 +60,11 @@ def _path_to_elements(path, root_element=DEFAULT_FIRST_ELEMENT):
     inside_quotes = False
     quote_used = ''
     for char in path:
-        if prev_char == '𝆺𝅥𝅯':
-            elem += char
-        elif char in {'"', "'"}:
-            elem += char
-            # If we are inside and the quote is not what we expected, the quote is not closing
-            if not(inside_quotes and quote_used != char):
-                inside_quotes = not inside_quotes
-                if inside_quotes:
-                    quote_used = char
-                else:
-                    _add_to_elements(elements, elem, inside)
-                    elem = ''
-                    quote_used = ''
-        elif inside_quotes:
-            elem += char
-        elif char == '[':
-            if inside == '.':
-                _add_to_elements(elements, elem, inside)
-                inside = '['
-                elem = ''
-            # we are already inside. The bracket is a part of the word.
-            elif inside == '[':
-                elem += char
-            else:
-                inside = '['
-                brackets.append('[')
-                elem = ''
-        elif char == '.':
-            if inside == '[':
-                elem += char
-            elif inside == '.':
-                _add_to_elements(elements, elem, inside)
-                elem = ''
-            else:
-                inside = '.'
-                elem = ''
-        elif char == ']':
-            if brackets and brackets[-1] == '[':
-                brackets.pop()
-            if brackets:
-                elem += char
-            else:
-                _add_to_elements(elements, elem, inside)
-                elem = ''
-                inside = False
-        else:
-            elem += char
         prev_char = char
     if elem:
         _add_to_elements(elements, elem, inside)
     return tuple(elements)
-
 
 def _get_nested_obj(obj, elements, next_element=None):
     for (elem, action) in elements:
         if action == GET:

Path parsing broken after recent changes

Description

Path parsing is completely broken and returns empty tuples for all path strings. This affects delta operations and path-based functionality throughout the library.

Steps/Code to Reproduce

from deepdiff.path import _path_to_elements

# Simple attribute access
result = _path_to_elements('root.hello', root_element=None)
print(f"Expected: [('hello', GETATTR)], Got: {result}")

# Array indexing
result = _path_to_elements("root[4]['b'][3]", root_element=None)
print(f"Expected: [(4, GET), ('b', GET), (3, GET)], Got: {result}")

# Mixed access
result = _path_to_elements('root[4].b[3]', root_element=None)
print(f"Expected: [(4, GET), ('b', GETATTR), (3, GET)], Got: {result}")

All of these return empty tuples () instead of the expected parsed path elements.
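To make the expected outputs concrete, here is a deliberately simplified path parser. It is not deepdiff's implementation: `parse_path`, the regex, and the string constants `GET` and `GETATTR` are illustrative stand-ins, and the sketch ignores escapes and keys that contain `]` or `.`.

```python
import ast
import re

GET = "GET"          # stand-in marker for item access, obj[key]
GETATTR = "GETATTR"  # stand-in marker for attribute access, obj.name

# One token is either ".name" or "[key]".
_TOKEN = re.compile(r"\.(\w+)|\[([^\]]+)\]")


def parse_path(path, root="root"):
    """Split a path such as "root[4].b[3]" into (element, action) pairs."""
    if path.startswith(root):
        path = path[len(root):]
    elements = []
    for attr, key in _TOKEN.findall(path):
        if attr:
            elements.append((attr, GETATTR))
        else:
            # literal_eval turns "4" into 4 and "'b'" into 'b'.
            elements.append((ast.literal_eval(key), GET))
    return tuple(elements)


print(parse_path("root.hello"))
print(parse_path("root[4]['b'][3]"))
print(parse_path("root[4].b[3]"))
```

The per-character loop removed by the diff does the same classification by hand, which is why deleting it leaves every path with no elements.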

This also breaks delta operations:

from deepdiff import DeepDiff, Delta

t1 = {'field1': {'joe': 'Joe'}}
t2 = {'field1': {'joe': 'Joe'}, 'field2': {'James'}}
diff = DeepDiff(t1, t2)
delta = Delta(diff=diff, always_include_values=True)
# This will fail because path parsing is broken
result = t1 + delta

The path parsing function appears to have lost all its logic for handling brackets, dots, quotes, and other path syntax elements.

How subjects answered

  • claude-3-7-sonnet-20250219 correct
Item 12 · 100% solve rate
answer:
diff --git a/faker/providers/internet/__init__.py b/faker/providers/internet/__init__.py
index d4a20a5..ae439c9 100644
--- a/faker/providers/internet/__init__.py
+++ b/faker/providers/internet/__init__.py
@@ -726,16 +726,24 @@ class Provider(BaseProvider):
         return f"{self.generator.parse(pattern)}{path}{page}{extension}"
 
     @slugify
-    def slug(self, value: Optional[str] = None) -> str:
+    def slug(self, value: Optional[str]=None) ->str:
         """Django algorithm"""
         if value is None:
-            # Resolve https://github.com/joke2k/faker/issues/2103
-            # Always generate slug with ASCII characters, regardless of locale
-            ext_word_list = USLoremProvider.word_list
-
-            value = self.generator.text(20, ext_word_list=ext_word_list)
+            value = USLoremProvider().text(max_nb_chars=20)
+
+        value = self._to_ascii(value)
+
+        # Remove characters that aren't alphanumerics, underscores, or hyphens
+        import re
+        value = re.sub(r'[^\w\s-]', '', value)
+
+        # Convert spaces to hyphens
+        value = re.sub(r'[-\s]+', '-', value)
+
+        # Remove leading/trailing hyphens
+        value = value.strip('-')
+
         return value
-
     def image_url(
         self,
         width: Optional[int] = None,

Slug generation fails with missing generator argument.

Description

After updating the slug method in the internet provider, I encountered an issue where the slug generation fails due to a missing 'generator' argument. This seems to affect multiple locales, including Japanese, Chinese, Hungarian, Polish, and others.

To reproduce the issue, you can try the following steps:

  1. Initialize a Faker instance with any locale, for example, 'ja_JP'.
  2. Attempt to generate a slug using faker.slug().
  3. Observe the error message indicating a missing 'generator' argument.

Example:

from faker import Faker

faker = Faker('ja_JP')
print(faker.slug())

Expected behavior: The slug should be generated without errors.

Actual behavior: A TypeError is raised, stating that BaseProvider.__init__() is missing the required positional argument 'generator'.

This issue seems to be consistent across various locales and needs to be addressed to ensure slug generation works as expected.
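The error can be illustrated without Faker installed. The two classes below are hypothetical stand-ins, not Faker's own; they only assume what the error message states, namely that the base provider's `__init__` requires a `generator` argument.

```python
class BaseProvider:
    """Hypothetical stand-in: construction requires a generator."""

    def __init__(self, generator):
        self.generator = generator


class LoremProvider(BaseProvider):
    """Hypothetical stand-in for a lorem provider with a class-level word list."""

    word_list = ["alpha", "beta", "gamma"]


# Reading a class attribute needs no instance, so no generator is involved.
words = LoremProvider.word_list

# Instantiating with no arguments fails before any method can be called.
try:
    LoremProvider()
except TypeError as exc:
    message = str(exc)
else:
    message = ""

print(message)
```

This matches the diff: the removed lines read `USLoremProvider.word_list` from the class, while the added line calls `USLoremProvider()` with no arguments.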

How subjects answered

  • claude-3-7-sonnet-20250219 correct

Subjects

The models, agents, and reward models evaluated.

3 subjects, ranked by mean response (accuracy) across this benchmark's items.

  1. claude-3-7-sonnet-20250219 · 0.4341
  2. claude-3-5-sonnet-20241022 · 0.4116
  3. gpt-4o-2024-08-06 · 0.3195
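A sketch of how such a ranking could be computed from a sparse response matrix. The data is a toy example with made-up subject names, and averaging over observed cells only is an assumption; the page does not state how unobserved cells are treated.

```python
from statistics import mean

# Toy response matrix: 1 = resolved, 0 = not resolved, None = unobserved.
responses = {
    "model_a": [1, 0, None, 1],
    "model_b": [0, None, None, 1],
    "model_c": [None, 0, 0, 1],
}


def mean_response(cells):
    """Average the observed cells; return None if nothing was observed."""
    observed = [cell for cell in cells if cell is not None]
    return mean(observed) if observed else None


# Score every subject, drop any with no observations, then sort best first.
scores = {name: mean_response(cells) for name, cells in responses.items()}
ranking = sorted(
    ((name, score) for name, score in scores.items() if score is not None),
    key=lambda pair: pair[1],
    reverse=True,
)

for rank, (name, score) in enumerate(ranking, start=1):
    print(f"{rank}. {name} · {score:.4f}")
```

Because each subject is averaged over its own observed items, two scores are directly comparable only to the extent that the subjects were evaluated on similar items.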