bpo-39939: Add str.removeprefix and str.removesuffix #18939

Merged on Apr 22, 2020 (39 commits)

Commits
0addc43
Add cutprefix and cutsuffix methods to str, bytes, and bytearray.
sweeneyde Mar 10, 2020
3adb9fa
pep 7: lining up argumenets
sweeneyde Mar 11, 2020
a7a1bc8
Revert "pep 7: lining up argumenets"
sweeneyde Mar 11, 2020
fe18644
pep 7: line up arguemnts
sweeneyde Mar 11, 2020
ff8e3c6
pep 7: line up arguemnts
sweeneyde Mar 11, 2020
5339a46
📜🤖 Added by blurb_it.
blurb-it[bot] Mar 11, 2020
cc85978
add UserString methods
sweeneyde Mar 11, 2020
1442ffe
Merge branch 'cut_affix' of https://github.com/sweeneyde/cpython into…
sweeneyde Mar 11, 2020
111b0f9
update count of objects in test_doctests
sweeneyde Mar 11, 2020
8265e4d
restore clinic output
sweeneyde Mar 11, 2020
7401b87
update count of objects in test_doctests
sweeneyde Mar 11, 2020
e550171
return original when bytes.cut***fix does not find match
sweeneyde Mar 11, 2020
0a5d0a9
Document cutprefix and cutsuffix
sweeneyde Mar 12, 2020
a126438
fix doctest in docs
sweeneyde Mar 12, 2020
fbc4a50
Add credit
sweeneyde Mar 12, 2020
3783dc3
make the empty affix case fast
sweeneyde Mar 12, 2020
428e733
clarified: one affix at a time
sweeneyde Mar 12, 2020
5796757
ensure tuples are not allowed
sweeneyde Mar 12, 2020
6fe9ac5
Fix userstring type behavior
sweeneyde Mar 12, 2020
13e8296
WhatsNew and ACKS
sweeneyde Mar 12, 2020
49fa220
WhatsNew and ACKS
sweeneyde Mar 12, 2020
550beca
fix spelling
sweeneyde Mar 12, 2020
01d0655
Direct readers from (l/r)strip to cut***fix
sweeneyde Mar 12, 2020
3c0e350
Merge branch 'cut_affix' of https://github.com/sweeneyde/cpython into…
sweeneyde Mar 12, 2020
fe80ba8
Fix typo in docs
sweeneyde Mar 16, 2020
ae23692
minor c formatting consistency
sweeneyde Mar 16, 2020
a9e253c
copy/paste errors; don't say 'return the original'
sweeneyde Mar 20, 2020
4c33b74
changed 'cut' to 'remove'
sweeneyde Mar 25, 2020
4413e2e
Change method names in whatsnew
sweeneyde Mar 25, 2020
5dfa968
Update Misc/NEWS.d/next/Core and Builtins/2020-03-11-19-17-36.bpo-399…
sweeneyde Mar 25, 2020
aa6eede
new names in the whatsnew header
sweeneyde Mar 28, 2020
d941711
Merge branch 'master' into cut_affix
sweeneyde Apr 9, 2020
f55836d
add examples of differences between l/rstrip and removeaffix
sweeneyde Apr 21, 2020
8d0584a
Merge branch 'cut_affix' of https://github.com/sweeneyde/cpython into…
sweeneyde Apr 21, 2020
8b6267a
apply changes from review
sweeneyde Apr 22, 2020
61cd530
apply changes from review
sweeneyde Apr 22, 2020
ffe72f1
more documentation tweaks
sweeneyde Apr 22, 2020
d8f5a99
clean up the NEWS entry
sweeneyde Apr 22, 2020
3df1f38
mention arg type in docstrings
sweeneyde Apr 22, 2020
104 changes: 102 additions & 2 deletions Doc/library/stdtypes.rst
@@ -1549,6 +1549,33 @@ expression support in the :mod:`re` module).
interpreted as in slice notation.


.. method:: str.removeprefix(prefix, /)

If the string starts with the *prefix* string, return
``string[len(prefix):]``. Otherwise, return a copy of the original
string::

>>> 'TestHook'.removeprefix('Test')
'Hook'
>>> 'BaseTestCase'.removeprefix('Test')
'BaseTestCase'

.. versionadded:: 3.9

.. method:: str.removesuffix(suffix, /)

If the string ends with the *suffix* string and that *suffix* is not empty,
return ``string[:-len(suffix)]``. Otherwise, return a copy of the
original string::

>>> 'MiscTests'.removesuffix('Tests')
'Misc'
>>> 'TmpDirMixin'.removesuffix('Tests')
'TmpDirMixin'

.. versionadded:: 3.9
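
The two descriptions above can be read as the pure-Python sketch below. It is only an illustration of the documented behaviour, not the C implementation in this PR, and the function names `removeprefix_sketch` and `removesuffix_sketch` are made up for the example. It also shows why the suffix case mentions "not empty": `s[:-0]` is the same as `s[:0]`, which would return an empty string.

```python
def removeprefix_sketch(s: str, prefix: str) -> str:
    """Model of str.removeprefix as documented above (illustrative only)."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return s[:]


def removesuffix_sketch(s: str, suffix: str) -> str:
    """Model of str.removesuffix as documented above (illustrative only)."""
    # Without the emptiness check, s[:-0] would be s[:0], i.e. ''.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s[:]


print(removeprefix_sketch('TestHook', 'Test'))
print(removesuffix_sketch('MiscTests', 'Tests'))
print(removesuffix_sketch('abcde', ''))
```

The sketch runs on any Python 3 version, so it could also serve as a rough stand-in where the 3.9 methods are unavailable.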


.. method:: str.encode(encoding="utf-8", errors="strict")

Return an encoded version of the string as a bytes object. Default encoding
@@ -1831,6 +1858,14 @@ expression support in the :mod:`re` module).
>>> 'www.example.com'.lstrip('cmowz.')
'example.com'

See :meth:`str.removeprefix` for a method that will remove a single prefix
string rather than all of a set of characters. For example::

>>> 'Arthur: three!'.lstrip('Arthur: ')
'ee!'
>>> 'Arthur: three!'.removeprefix('Arthur: ')
'three!'
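
The `'ee!'` result above follows from `lstrip` treating its argument as a set of characters rather than a substring. A rough model of that behaviour, written only for illustration (the helper name `lstrip_chars_sketch` is hypothetical and this is not how CPython implements it), might look like this:

```python
def lstrip_chars_sketch(s: str, chars: str) -> str:
    """Illustrative model of str.lstrip(chars): drop leading characters in the set."""
    i = 0
    # Keep advancing while the current character is one of the given characters.
    while i < len(s) and s[i] in chars:
        i += 1
    return s[i:]


# 't', 'h' and 'r' of "three!" are also in 'Arthur: ', so they are stripped too.
print(lstrip_chars_sketch('Arthur: three!', 'Arthur: '))
print(lstrip_chars_sketch('www.example.com', 'cmowz.'))
```

The same reasoning, mirrored to the right-hand end of the string, applies to `rstrip` versus `removesuffix`.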


.. staticmethod:: str.maketrans(x[, y[, z]])

@@ -1911,6 +1946,13 @@ expression support in the :mod:`re` module).
>>> 'mississippi'.rstrip('ipz')
'mississ'

See :meth:`str.removesuffix` for a method that will remove a single suffix
string rather than all of a set of characters. For example::

>>> 'Monty Python'.rstrip(' Python')
'M'
>>> 'Monty Python'.removesuffix(' Python')
'Monty'

.. method:: str.split(sep=None, maxsplit=-1)

@@ -2591,6 +2633,50 @@ arbitrary binary data.
Also accept an integer in the range 0 to 255 as the subsequence.


.. method:: bytes.removeprefix(prefix, /)
bytearray.removeprefix(prefix, /)

If the binary data starts with the *prefix* string, return
``bytes[len(prefix):]``. Otherwise, return a copy of the original
binary data::

>>> b'TestHook'.removeprefix(b'Test')
b'Hook'
>>> b'BaseTestCase'.removeprefix(b'Test')
b'BaseTestCase'

The *prefix* may be any :term:`bytes-like object`.

.. note::

The bytearray version of this method does *not* operate in place -
it always produces a new object, even if no changes were made.

.. versionadded:: 3.9


.. method:: bytes.removesuffix(suffix, /)
bytearray.removesuffix(suffix, /)

If the binary data ends with the *suffix* string and that *suffix* is
not empty, return ``bytes[:-len(suffix)]``. Otherwise, return a copy of
the original binary data::

>>> b'MiscTests'.removesuffix(b'Tests')
b'Misc'
>>> b'TmpDirMixin'.removesuffix(b'Tests')
b'TmpDirMixin'

The *suffix* may be any :term:`bytes-like object`.

.. note::

The bytearray version of this method does *not* operate in place -
it always produces a new object, even if no changes were made.

.. versionadded:: 3.9
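
A short usage sketch of the two notes above, assuming Python 3.9 or later: the affix can be any bytes-like object (a `memoryview` here), and the `bytearray` methods return a new object even when nothing is removed.

```python
data = bytearray(b'TestHook')

# Any bytes-like object is accepted as the affix, for example a memoryview.
stripped = data.removeprefix(memoryview(b'Test'))

# No match: the result compares equal but is a distinct bytearray object.
unchanged = data.removesuffix(b'Tests')

print(stripped)            # bytearray(b'Hook')
print(unchanged == data)   # True
print(unchanged is data)   # False
```

Because the result is always a fresh object, mutating it afterwards leaves the original `bytearray` untouched.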


.. method:: bytes.decode(encoding="utf-8", errors="strict")
bytearray.decode(encoding="utf-8", errors="strict")

@@ -2841,7 +2927,14 @@ produce new objects.
b'example.com'

The binary sequence of byte values to remove may be any
:term:`bytes-like object`.
:term:`bytes-like object`. See :meth:`~bytes.removeprefix` for a method
that will remove a single prefix string rather than all of a set of
characters. For example::

>>> b'Arthur: three!'.lstrip(b'Arthur: ')
b'ee!'
>>> b'Arthur: three!'.removeprefix(b'Arthur: ')
b'three!'

.. note::

@@ -2890,7 +2983,14 @@ produce new objects.
b'mississ'

The binary sequence of byte values to remove may be any
:term:`bytes-like object`.
:term:`bytes-like object`. See :meth:`~bytes.removesuffix` for a method
that will remove a single suffix string rather than all of a set of
characters. For example::

>>> b'Monty Python'.rstrip(b' Python')
b'M'
>>> b'Monty Python'.removesuffix(b' Python')
b'Monty'

.. note::

10 changes: 10 additions & 0 deletions Doc/whatsnew/3.9.rst
@@ -105,6 +105,16 @@ Merge (``|``) and update (``|=``) operators have been added to the built-in
:class:`dict` class. See :pep:`584` for a full description.
(Contributed by Brandt Bucher in :issue:`36144`.)

PEP 616: New removeprefix() and removesuffix() string methods
-------------------------------------------------------------

:meth:`str.removeprefix(prefix)<str.removeprefix>` and
:meth:`str.removesuffix(suffix)<str.removesuffix>` have been added
to easily remove an unneeded prefix or a suffix from a string. Corresponding
``bytes``, ``bytearray``, and ``collections.UserString`` methods have also been
added. See :pep:`616` for a full description. (Contributed by Dennis Sweeney in
:issue:`18939`.)
Contributor

This is not 18939, which is about "Venv docs regarding original python install". Shouldn't this be 39939?

Member

Right, it's a typo error: @sweeneyde: can you please propose a PR to fix the typo?

Member

Or @elazarg: Do you want to propose a PR to fix the typo?

Contributor

Sure. I thought it might be overkill :)

Member Author

Oops -- It looks like that's the GitHub PR number rather than the bpo number. I can't make a PR tonight so feel free to change it. If not, I can fix it tomorrow.

Contributor

Done.



Other Language Changes
======================
8 changes: 8 additions & 0 deletions Lib/collections/__init__.py
@@ -1239,6 +1239,14 @@ def count(self, sub, start=0, end=_sys.maxsize):
        if isinstance(sub, UserString):
            sub = sub.data
        return self.data.count(sub, start, end)
    def removeprefix(self, prefix, /):
        if isinstance(prefix, UserString):
            prefix = prefix.data
        return self.__class__(self.data.removeprefix(prefix))
    def removesuffix(self, suffix, /):
        if isinstance(suffix, UserString):
            suffix = suffix.data
        return self.__class__(self.data.removesuffix(suffix))
    def encode(self, encoding='utf-8', errors='strict'):
        encoding = 'utf-8' if encoding is None else encoding
        errors = 'strict' if errors is None else errors
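
As a usage sketch of the two `UserString` methods added above (assuming Python 3.9 or later): the affix may itself be a `UserString`, and the result is built with `self.__class__`, so a subclass gets its own type back. The `Tag` subclass is hypothetical and exists only for this illustration.

```python
from collections import UserString


class Tag(UserString):
    """Hypothetical subclass used only for this illustration."""


name = Tag('TestHook')

# The prefix is unwrapped via .data before being passed to str.removeprefix.
result = name.removeprefix(UserString('Test'))

print(type(result).__name__, result)  # Tag Hook
```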
36 changes: 36 additions & 0 deletions Lib/test/string_tests.py
@@ -682,6 +682,42 @@ def test_replace_overflow(self):
        self.checkraises(OverflowError, A2_16, "replace", "A", A2_16)
        self.checkraises(OverflowError, A2_16, "replace", "AA", A2_16+A2_16)

    def test_removeprefix(self):
        self.checkequal('am', 'spam', 'removeprefix', 'sp')
        self.checkequal('spamspam', 'spamspamspam', 'removeprefix', 'spam')
        self.checkequal('spam', 'spam', 'removeprefix', 'python')
        self.checkequal('spam', 'spam', 'removeprefix', 'spider')
        self.checkequal('spam', 'spam', 'removeprefix', 'spam and eggs')

        self.checkequal('', '', 'removeprefix', '')
        self.checkequal('', '', 'removeprefix', 'abcde')
        self.checkequal('abcde', 'abcde', 'removeprefix', '')
        self.checkequal('', 'abcde', 'removeprefix', 'abcde')

        self.checkraises(TypeError, 'hello', 'removeprefix')
        self.checkraises(TypeError, 'hello', 'removeprefix', 42)
        self.checkraises(TypeError, 'hello', 'removeprefix', 42, 'h')
        self.checkraises(TypeError, 'hello', 'removeprefix', 'h', 42)
        self.checkraises(TypeError, 'hello', 'removeprefix', ("he", "l"))

    def test_removesuffix(self):
        self.checkequal('sp', 'spam', 'removesuffix', 'am')
        self.checkequal('spamspam', 'spamspamspam', 'removesuffix', 'spam')
        self.checkequal('spam', 'spam', 'removesuffix', 'python')
        self.checkequal('spam', 'spam', 'removesuffix', 'blam')
        self.checkequal('spam', 'spam', 'removesuffix', 'eggs and spam')

        self.checkequal('', '', 'removesuffix', '')
        self.checkequal('', '', 'removesuffix', 'abcde')
        self.checkequal('abcde', 'abcde', 'removesuffix', '')
        self.checkequal('', 'abcde', 'removesuffix', 'abcde')

        self.checkraises(TypeError, 'hello', 'removesuffix')
        self.checkraises(TypeError, 'hello', 'removesuffix', 42)
        self.checkraises(TypeError, 'hello', 'removesuffix', 42, 'h')
        self.checkraises(TypeError, 'hello', 'removesuffix', 'h', 42)
        self.checkraises(TypeError, 'hello', 'removesuffix', ("lo", "l"))

    def test_capitalize(self):
        self.checkequal(' hello ', ' hello ', 'capitalize')
        self.checkequal('Hello ', 'Hello ','capitalize')
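
The last `checkraises` line in each of the two new tests covers the tuple case. As a small illustration of what that asserts (assuming Python 3.9 or later), `startswith` accepts a tuple of candidates while `removeprefix` takes exactly one string:

```python
text = 'hello'

# startswith accepts a tuple of candidate prefixes...
print(text.startswith(('he', 'l')))  # True

# ...but removeprefix takes exactly one str and rejects a tuple.
try:
    text.removeprefix(('he', 'l'))
except TypeError:
    outcome = 'TypeError'
else:
    outcome = 'no error'

print(outcome)  # TypeError
```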
2 changes: 1 addition & 1 deletion Lib/test/test_doctest.py
@@ -665,7 +665,7 @@ def non_Python_modules(): r"""

>>> import builtins
>>> tests = doctest.DocTestFinder().find(builtins)
>>> 810 < len(tests) < 830 # approximate number of objects with docstrings
>>> 816 < len(tests) < 836 # approximate number of objects with docstrings
True
>>> real_tests = [t for t in tests if len(t.examples) > 0]
>>> len(real_tests) # objects that actually have doctests
1 change: 1 addition & 0 deletions Misc/ACKS
@@ -1660,6 +1660,7 @@ Hisao Suzuki
Kalle Svensson
Andrew Svetlov
Paul Swartz
Dennis Sweeney
Al Sweigart
Sviatoslav Sydorenko
Thenault Sylvain
@@ -0,0 +1,5 @@
Added str.removeprefix and str.removesuffix methods and corresponding
bytes, bytearray, and collections.UserString methods to remove affixes
from a string if present.
See :pep:`616` for a full description.
Patch by Dennis Sweeney.
67 changes: 67 additions & 0 deletions Objects/bytearrayobject.c
@@ -1186,6 +1186,71 @@ bytearray_endswith(PyByteArrayObject *self, PyObject *args)
    return _Py_bytes_endswith(PyByteArray_AS_STRING(self), PyByteArray_GET_SIZE(self), args);
}

/*[clinic input]
bytearray.removeprefix as bytearray_removeprefix

    prefix: Py_buffer
    /

Return a bytearray with the given prefix string removed if present.

If the bytearray starts with the prefix string, return
bytearray[len(prefix):]. Otherwise, return a copy of the original
bytearray.
[clinic start generated code]*/

static PyObject *
bytearray_removeprefix_impl(PyByteArrayObject *self, Py_buffer *prefix)
/*[clinic end generated code: output=6cabc585e7f502e0 input=968aada38aedd262]*/
{
    const char *self_start = PyByteArray_AS_STRING(self);
    Py_ssize_t self_len = PyByteArray_GET_SIZE(self);
    const char *prefix_start = prefix->buf;
    Py_ssize_t prefix_len = prefix->len;

    if (self_len >= prefix_len
        && memcmp(self_start, prefix_start, prefix_len) == 0)
    {
        return PyByteArray_FromStringAndSize(self_start + prefix_len,
                                             self_len - prefix_len);
    }

    return PyByteArray_FromStringAndSize(self_start, self_len);
}

/*[clinic input]
bytearray.removesuffix as bytearray_removesuffix

    suffix: Py_buffer
    /

Return a bytearray with the given suffix string removed if present.

If the bytearray ends with the suffix string and that suffix is not
empty, return bytearray[:-len(suffix)]. Otherwise, return a copy of
the original bytearray.
[clinic start generated code]*/

static PyObject *
bytearray_removesuffix_impl(PyByteArrayObject *self, Py_buffer *suffix)
/*[clinic end generated code: output=2bc8cfb79de793d3 input=c1827e810b2f6b99]*/
{
    const char *self_start = PyByteArray_AS_STRING(self);
    Py_ssize_t self_len = PyByteArray_GET_SIZE(self);
    const char *suffix_start = suffix->buf;
    Py_ssize_t suffix_len = suffix->len;

    if (self_len >= suffix_len
        && memcmp(self_start + self_len - suffix_len,
                  suffix_start, suffix_len) == 0)
    {
        return PyByteArray_FromStringAndSize(self_start,
                                             self_len - suffix_len);
    }

    return PyByteArray_FromStringAndSize(self_start, self_len);
}


/*[clinic input]
bytearray.translate
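
For readers less familiar with the C API, the length check plus `memcmp` in `bytearray_removesuffix_impl` above can be modelled roughly in Python as follows. This is a sketch for illustration only: the name `removesuffix_model` is hypothetical, and `bytes(suffix)` merely stands in for reading the `Py_buffer`. Because the C code slices by explicit lengths, an empty suffix needs no special case there, unlike the `[:-len(suffix)]` wording in the docstring.

```python
def removesuffix_model(data: bytearray, suffix) -> bytearray:
    """Python model of the length-and-memcmp logic above (illustrative only)."""
    suffix_bytes = bytes(suffix)  # stands in for the Py_buffer contents
    self_len, suffix_len = len(data), len(suffix_bytes)
    start = self_len - suffix_len
    if self_len >= suffix_len and data[start:] == suffix_bytes:
        # Slice by explicit length, so an empty suffix needs no special case.
        return bytearray(data[:start])
    # No match: still a fresh object, like PyByteArray_FromStringAndSize.
    return bytearray(data)


print(removesuffix_model(bytearray(b'MiscTests'), b'Tests'))
print(removesuffix_model(bytearray(b'abcde'), b''))
```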
@@ -2208,6 +2273,8 @@ bytearray_methods[] = {
    BYTEARRAY_POP_METHODDEF
    BYTEARRAY_REMOVE_METHODDEF
    BYTEARRAY_REPLACE_METHODDEF
    BYTEARRAY_REMOVEPREFIX_METHODDEF
    BYTEARRAY_REMOVESUFFIX_METHODDEF
    BYTEARRAY_REVERSE_METHODDEF
    {"rfind", (PyCFunction)bytearray_rfind, METH_VARARGS, _Py_rfind__doc__},
    {"rindex", (PyCFunction)bytearray_rindex, METH_VARARGS, _Py_rindex__doc__},