test: reducer CUDA kernel tests #3162

Merged
merged 37 commits into from
Jun 25, 2024

Conversation

ManasviGoyal
Collaborator

@ManasviGoyal ManasviGoyal commented Jun 21, 2024

  1. fixes the error for the EmptyArray case in reducers
  2. adds all test_0115_generic_reducer_operation.py tests, for axis=-1 only, to tests-cuda
  3. adds CUDA reducer tests that check block boundary cases for array_size = 3000
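
To illustrate what a block boundary check of this kind exercises, here is a minimal CPU-only sketch in NumPy. It is not the PR's kernel or test code: the function names `segmented_sum_blockwise` and `reference_sum`, the block size of 1024, and the sublist length of 7 are all assumptions chosen for illustration. Only `array_size = 3000` and the empty-input case come from the description above. The idea is that a reduction over the last axis must give the same answer whether or not a sublist happens to straddle the edge between two blocks.

```python
import numpy as np

BLOCK_SIZE = 1024  # assumption: threads per block; the PR does not state this value


def segmented_sum_blockwise(content, offsets, block_size=BLOCK_SIZE):
    """Sum each sublist content[offsets[i]:offsets[i + 1]], visiting `content`
    one block at a time (a CPU stand-in for a grid of CUDA blocks)."""
    num_sublists = len(offsets) - 1
    out = np.zeros(num_sublists, dtype=content.dtype)
    # parents[j] is the index of the sublist that element j belongs to.
    parents = np.repeat(np.arange(num_sublists), np.diff(offsets))
    for start in range(0, len(content), block_size):
        stop = min(start + block_size, len(content))
        # np.add.at accumulates unbuffered, so a sublist split across two
        # blocks still receives both partial contributions.
        np.add.at(out, parents[start:stop], content[start:stop])
    return out


def reference_sum(content, offsets):
    """Straightforward per-sublist sum used as the expected answer."""
    return np.array(
        [content[a:b].sum() for a, b in zip(offsets[:-1], offsets[1:])],
        dtype=content.dtype,
    )


# 3000 elements span three hypothetical blocks of 1024, and sublists of
# length 7 do not align with the block edges at 1024 and 2048.
array_size = 3000
content = np.arange(array_size, dtype=np.int64)
offsets = np.append(np.arange(0, array_size, 7), array_size)

result = segmented_sum_blockwise(content, offsets)
expected = reference_sum(content, offsets)

# Empty input: no elements and no sublists, so the output must be empty too.
empty_result = segmented_sum_blockwise(np.array([], dtype=np.int64), np.array([0]))
```

Comparing `result` with `expected` (for example with `np.array_equal`) is the CPU analogue of comparing a GPU reducer result against the CPU backend. The size 3000 is useful because it is not a multiple of the assumed block size, so the last block is only partly filled.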
Collaborator

@ianna ianna left a comment

@ManasviGoyal - there are some issues with this:

tests-cuda/test_3162_cuda_generic_reducer_operation.py ...FFFFFFFFFFFFFF [ 40%]
FFFFFFFEFEFEFEFEFEFEFEFEFEFEFEFEFEFEFEFEFEFE                             [100%]

==================================== ERRORS ====================================
_____ ERROR at teardown of test_0115_generic_reducer_operation_EmptyArray ______

cls = <class '_pytest.runner.CallInfo'>
func = <function call_runtest_hook.<locals>.<lambda> at 0x774d45102340>
when = 'teardown'
reraise = (<class '_pytest.outcomes.Exit'>, <class 'KeyboardInterrupt'>)
result     = None
start      = 1719228956.2746673
stop       = 1719228956.2760165
when       = 'teardown'

../../../anaconda3/lib/python3.11/site-packages/_pytest/runner.py:341: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../anaconda3/lib/python3.11/site-packages/_pytest/runner.py:262: in <lambda>
    lambda: ihook(item=item, **kwds), when=when, reraise=reraise
        ihook      = <_HookCaller 'pytest_runtest_teardown'>
        item       = <Function test_0115_generic_reducer_operation_EmptyArray>
        kwds       = {'nextitem': <Function test_0115_generic_reducer_operation_IndexedOptionArray_1>}
../../../anaconda3/lib/python3.11/site-packages/pluggy/_hooks.py:265: in __call__
    return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
        argname    = 'nextitem'
        args       = ()
        firstresult = False
        kwargs     = {'item': <Function test_0115_generic_reducer_operation_EmptyArray>, 'nextitem': <Function test_0115_generic_reducer_operation_IndexedOptionArray_1>}
        self       = <_HookCaller 'pytest_runtest_teardown'>
../../../anaconda3/lib/python3.11/site-packages/pluggy/_manager.py:80: in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
        firstresult = False
        hook_name  = 'pytest_runtest_teardown'
        kwargs     = {'item': <Function test_0115_generic_reducer_operation_EmptyArray>, 'nextitem': <Function test_0115_generic_reducer_operation_IndexedOptionArray_1>}
        methods    = [<HookImpl plugin_name='runner', plugin=<module '_pytest.runner' from '/home/ianna/anaconda3/lib/python3.11/site-packa...odule '_pytest.threadexception' from '/home/ianna/anaconda3/lib/python3.11/site-packages/_pytest/threadexception.py'>>]
        self       = <_pytest.config.PytestPluginManager object at 0x774d60e97190>
../../../anaconda3/lib/python3.11/site-packages/_pytest/unraisableexception.py:93: in pytest_runtest_teardown
    yield from unraisable_exception_runtest_hook()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

    def unraisable_exception_runtest_hook() -> Generator[None, None, None]:
        with catch_unraisable_exception() as cm:
            yield
            if cm.unraisable:
                if cm.unraisable.err_msg is not None:
                    err_msg = cm.unraisable.err_msg
                else:
                    err_msg = "Exception ignored in"
                msg = f"{err_msg}: {cm.unraisable.object!r}\n\n"
                msg += "".join(
                    traceback.format_exception(
                        cm.unraisable.exc_type,
                        cm.unraisable.exc_value,
                        cm.unraisable.exc_traceback,
                    )
                )
>               warnings.warn(pytest.PytestUnraisableExceptionWarning(msg))
E               pytest.PytestUnraisableExceptionWarning: Exception ignored in: 'cupy.cuda.memory.Memory.__dealloc__'
E               
E               Traceback (most recent call last):
E                 File "cupy_backends/cuda/api/runtime.pyx", line 570, in cupy_backends.cuda.api.runtime.free
E                 File "cupy_backends/cuda/api/runtime.pyx", line 146, in cupy_backends.cuda.api.runtime.check_status
E               cupy_backends.cuda.api.runtime.CUDARuntimeError: cudaErrorIllegalAddress: an illegal memory access was encountered

cm         = <_pytest.unraisableexception.catch_unraisable_exception object at 0x774d3078af50>
err_msg    = 'Exception ignored in'
msg        = 'Exception ignored in: \'cupy.cuda.memory.Memory.__dealloc__\'\n\nTraceback (most recent call last):\n  File "cupy_bac...\ncupy_backends.cuda.api.runtime.CUDARuntimeError: cudaErrorIllegalAddress: an illegal memory access was encountered\n'

../../../anaconda3/lib/python3.11/site-packages/_pytest/unraisableexception.py:78: PytestUnraisableExceptionWarning
--------------------------- Captured stderr teardown ---------------------------
Traceback (most recent call last):
  File "cupy_backends/cuda/api/runtime.pyx", line 570, in cupy_backends.cuda.api.runtime.free
  File "cupy_backends/cuda/api/runtime.pyx", line 146, in cupy_backends.cuda.api.runtime.check_status
cupy_backends.cuda.api.runtime.CUDARuntimeError: cudaErrorIllegalAddress: an illegal memory access was encountered
Traceback (most recent call last):
  File "cupy_backends/cuda/api/runtime.pyx", line 570, in cupy_backends.cuda.api.runtime.free
  File "cupy_backends/cuda/api/runtime.pyx", line 146, in cupy_backends.cuda.api.runtime.check_status
cupy_backends.cuda.api.runtime.CUDARuntimeError: cudaErrorIllegalAddress: an illegal memory access was encountered
Traceback (most recent call last):
...
dev/generate-tests.py (review comment: outdated, resolved)
Co-authored-by: Ianna Osborne <ianna.osborne@cern.ch>
@ManasviGoyal
Collaborator Author

@ManasviGoyal - there are some issues with this:

@ianna Hi, I have already added the fix for this so there shouldn't be any error. Can you pull all the changes in this PR and test again?

Collaborator

@ianna ianna left a comment

@ManasviGoyal - all tests pass, thanks!

@ManasviGoyal ManasviGoyal marked this pull request as ready for review June 24, 2024 14:48
@ManasviGoyal
Collaborator Author

ManasviGoyal commented Jun 24, 2024

@ManasviGoyal - all tests pass, thanks!

@ianna Great! This PR and #3136 can be merged once the macOS issue in the CI is fixed. Thanks!

@ManasviGoyal ManasviGoyal mentioned this pull request Jun 24, 2024
13 tasks
@jpivarski
Member

This PR has a lot of conflicts with #3136, which has now been merged. The conflicts are in the kernels themselves, so it's something @ManasviGoyal will have to look at.

@ManasviGoyal
Collaborator Author

This PR has a lot of conflicts with #3136, which has now been merged. The conflicts are in the kernels themselves, so it's something @ManasviGoyal will have to look at.

@jpivarski I have fixed the conflicts. If the tests pass, this should be ready for merge.

Member

@jpivarski jpivarski left a comment

This includes corrections to the kernels as well as new tests, so it looks like the new tests found some problems, which have now been fixed.

I just ran them all on my GPU and everything still works. I think this is ready to merge, so I'll merge it now.

@jpivarski jpivarski merged commit a1da072 into main Jun 25, 2024
39 checks passed
@jpivarski jpivarski deleted the ManasviGoyal/reducer-tests branch June 25, 2024 17:19