bug-dataset Failing tests on the fixed version

Hi,

There seem to be some unexpected failing test cases on the fixed version of, e.g., Express-1f. Here are the steps to reproduce the issue.

My system:

$ sw_vers
ProductName:		macOS
ProductVersion:		13.0.1
BuildVersion:		22A400

$ git --version
git version 2.38.1

$ python3 --version
Python 3.10.8

$ node --version
v18.11.0

$ npm --version
8.19.2

# Get BugsJS
$ rm -rf /tmp/bugsjs-issue-11
$ git clone https://github.com/BugsJS/bug-dataset.git /tmp/bugsjs-issue-11
$ cd /tmp/bugsjs-issue-11

# Run `test` on Express-1f
$ python3 main.py -p Express -b 1 -t test -v fixed -o Express-1f-test/

which produces the following log

Cloning into 'express'...
remote: Enumerating objects: 30151, done.
remote: Counting objects: 100% (2/2), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 30151 (delta 0), reused 0 (delta 0), pack-reused 30149
Receiving objects: 100% (30151/30151), 8.56 MiB | 8.51 MiB/s, done.
Resolving deltas: 100% (17015/17015), done.
Note: switching to 'tags/Bug-1-full'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 8e0080e1 Bug-1 full
/bin/sh: n: command not found
npm WARN deprecated [email protected]: 'native-or-bluebird' is deprecated. Please use 'any-promise' instead.
npm WARN deprecated [email protected]: Deprecated, use jstransformer
npm WARN deprecated [email protected]: Please update to at least constantinople 3.1.1
npm WARN deprecated [email protected]: please upgrade to graceful-fs 4 for compatibility with current and future versions of Node.js
npm WARN deprecated [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN deprecated [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN deprecated [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN deprecated [email protected]: Legacy versions of mkdirp are no longer supported. Please update to mkdirp 1.x. (Note that the API surface has changed to use Promises in 1.x.)
npm WARN deprecated [email protected]: Legacy versions of mkdirp are no longer supported. Please update to mkdirp 1.x. (Note that the API surface has changed to use Promises in 1.x.)
npm WARN deprecated [email protected]: Legacy versions of mkdirp are no longer supported. Please update to mkdirp 1.x. (Note that the API surface has changed to use Promises in 1.x.)
npm WARN deprecated [email protected]: Please upgrade to v7.0.2+ of superagent.  We have fixed numerous issues with streams, form-data, attach(), filesystem errors not bubbling up (ENOENT on attach()), and all tests are now passing.  See the releases tab for more information at <https://github.com/visionmedia/superagent/releases>.
npm WARN deprecated [email protected]: Jade has been renamed to pug, please install the latest version of pug instead of jade
npm WARN deprecated [email protected]: Critical security bugs fixed in 2.5.5
npm WARN deprecated [email protected]: Please upgrade to latest, formidable@v2 or formidable@v3! Check these notes: https://bit.ly/2ZEqIau
npm WARN deprecated [email protected]: Jade has been renamed to pug, please install the latest version of pug instead of jade
npm WARN deprecated [email protected]: Mocha v2.0.x is no longer supported.
npm WARN deprecated [email protected]: This module is no longer maintained, try this instead:
npm WARN deprecated   npm i nyc
npm WARN deprecated Visit https://istanbul.js.org/integrations for other alternatives.

added 173 packages, and audited 174 packages in 12s

1 package is looking for funding
  run `npm fund` for details

38 vulnerabilities (1 low, 6 moderate, 21 high, 10 critical)

To address all issues (including breaking changes), run:
  npm audit fix --force

Run `npm audit` for details.
(node:95589) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
(node:95589) [DEP0066] DeprecationWarning: OutgoingMessage.prototype._headers is deprecated
Number of tests: 737
	passes: 718
	failures: 21
	pending: 0
/bin/sh: istanbul: command not found

21 failing tests. Given that I'm running the tests on the fixed version, I was expecting 0 failures. Please find attached the resulting test_results.json.zip file. Is this somehow expected? Do you spot any misconfiguration on my end?

Miscellaneous

A couple of errors I noticed in the log which may or may not be the reason for the 21 failing tests:

/bin/sh: n: command not found which is thrown by this line of code.
/bin/sh: istanbul: command not found which is thrown by the code-coverage step. I've tried to add sp.call("npm install istanbul", shell=True) to the run_npm_install function to somehow force the installation of istanbul, but it didn't work.

-- Best, Jose

Feb 23 '23 13:02 jose

After analyzing the provided Docker setup and reproducing its installation steps, I managed to address the miscellaneous issues.

The /bin/sh: n: command not found can be addressed by running npm install -g n. (I was not aware of the n package to manage different versions of node.)
The /bin/sh: istanbul: command not found can be addressed by running npm install -g istanbul. (It seems that -g (i.e., global) did the trick.)
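
For anyone hitting the same two errors, here is a minimal sketch of a pre-flight check that could be run before main.py. It only assumes that `n` and `istanbul` must be resolvable on PATH, as the log suggests; the helper name `missing_tools` is hypothetical and not part of BugsJS.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be resolved on PATH.

    `which` is injectable so the check can be exercised without touching
    the real environment; by default it is shutil.which.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # The two commands the log reported as "command not found".
    for tool in missing_tools(["n", "istanbul"]):
        print(f"{tool}: not on PATH, try `npm install -g {tool}`")
```

A local `npm install istanbul` puts the binary under node_modules/.bin, which is not on the shell's PATH by default, so that may be why only the global install helped.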

And I'm now down to 11 failing tests: 8 unit tests and 3 acceptance tests.

Feb 24 '23 06:02 jose

Well, according to the Projects/Express/Express_bugs.csv file, there are a total of 732 tests on Express-1: 721 pass and 11 fail, and I'm now getting those exact numbers with the test command. For the record, here's the list of tests that fail on both Express-1b and Express-1f.

Test	JS file
"should restore req.params after leaving router"	test/app.router.js
"should be invoked instead of auto-responding"	test/res.format.js
"should respond with html"	test/res.redirect.js
"should escape the url"	test/res.redirect.js
"should respond with text"	test/res.redirect.js
"should encode the url"	test/res.redirect.js
"should set body to \"\""	test/res.send.js
"should invoke the callback on 404"	test/res.sendFile.js
"should succeed with proper cookie"	test/acceptance/auth.js
"should fail without proper password"	test/acceptance/auth.js
"should succeed with proper credentials"	test/acceptance/auth.js

The point is, if I have the same set of failing tests on the fixed (aka non-buggy) and the buggy version, which tests truly trigger the buggy behavior? I would say none, but I might be missing something here.

Feb 24 '23 07:02 jose

Hmm, I think I finally understand the issue. There are 11 failing tests on both the fixed and the buggy version, and 12 failing tests on the fixed-only-test-change version. The 11 tests can be considered broken tests (i.e., they fail for reasons other than the Express-1 bug), and the extra one is the test that triggers the buggy behavior.

Test	JS file	buggy	fixed	fixed-only-test-change	fault-revealing
"should restore req.params after leaving router"	test/app.router.js	Fail	Fail	Fail	No
"should be invoked instead of auto-responding"	test/res.format.js	Fail	Fail	Fail	No
"should respond with html"	test/res.redirect.js	Fail	Fail	Fail	No
"should escape the url"	test/res.redirect.js	Fail	Fail	Fail	No
"should respond with text"	test/res.redirect.js	Fail	Fail	Fail	No
"should encode the url"	test/res.redirect.js	Fail	Fail	Fail	No
"should set body to \"\""	test/res.send.js	Fail	Fail	Fail	No
"should invoke the callback on 404"	test/res.sendFile.js	Fail	Fail	Fail	No
"should succeed with proper cookie"	test/acceptance/auth.js	Fail	Fail	Fail	No
"should fail without proper password"	test/acceptance/auth.js	Fail	Fail	Fail	No
"should succeed with proper credentials"	test/acceptance/auth.js	Fail	Fail	Fail	No
"should only include each method once"	test/app.options.js	---	Pass	Fail	Yes

Is there any way/procedure/script to automatically remove the 11 broken tests from, at least, the fixed version? Otherwise, how would one be able to compute accurate code coverage or mutation score of a suite that has 11 failing tests?
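
The rule I used to fill in the table above can be sketched as follows. This is only my reading of the three versions, not something BugsJS ships: the function name `classify` and the hard-coded statuses are illustrative, and in practice the statuses would have to come from the test_results.json files, whose layout I'm not assuming here.

```python
def classify(buggy, fixed, test_change):
    """Label one test from its status on the three versions.

    Each status is "Pass", "Fail", or "---" (test not present).
    Assumed rule: a test that passes on `fixed` but fails on
    `fixed-only-test-change` reveals the fault; a test that fails
    on `fixed` is broken for unrelated reasons. The `buggy` status
    is accepted for completeness but not needed by this rule.
    """
    if fixed == "Pass" and test_change == "Fail":
        return "fault-revealing"
    if fixed == "Fail":
        return "broken"
    return "ok"


# Two rows copied from the table: (buggy, fixed, fixed-only-test-change).
results = {
    "should restore req.params after leaving router": ("Fail", "Fail", "Fail"),
    "should only include each method once": ("---", "Pass", "Fail"),
}

labels = {name: classify(*statuses) for name, statuses in results.items()}
broken = sorted(name for name, label in labels.items() if label == "broken")
```

The `broken` list is what one would want to exclude (e.g., skip in mocha) before measuring coverage or mutation score on the fixed version.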

PS: A note to my future self: the fixed-only-test-change version corresponds (roughly) to the buggy version in Defects4J.

Feb 24 '23 07:02 jose

Thanks for your exploration on this, @jose, it really helped me. I wrote some scripts to collect the tests that we should expect to pass on the "fixed" version and fail on the "fixed-only-test-change" version. Attaching this here in case someone else comes across and tries to use this dataset.

You should run run_all_tests.sh first inside the bugsjs Docker container, then run collect_all_tests.py to read and summarize the test results. Some paths are specific to my setup.

collect_all_tests.zip

Jan 27 '25 15:01 bstee615