
Android `fetch` hangs indefinitely with IPv6 hosts on some devices (Happy Eyeballs)

Open andreialecu opened this issue 4 years ago • 32 comments

Description

This is a very bizarre issue that has been previously reported a bunch of times, and this is basically a continuation of:

https://github.com/facebook/react-native/issues/29608

I initially started running into this on RN 0.66 with AWS Cognito. Bumping to 0.66.3 didn't help.

I'm also pretty sure this used to work before and I'm not sure when it broke. It's on an app that has been shelved for a while.

The problem is very strange because the network request does not seem to be issued, but simply hitting CMD+S to save any file so that a hot-reload is issued will immediately dispatch the network request.

I discovered the promise hanging issue by adding some logs to the fetch calls the Cognito library was making: [image] Notice how the .then is not executed.

While troubleshooting I came across a mention here of a workaround: https://github.com/facebook/react-native/issues/29608#issuecomment-884521699 (courtesy of @danmaas) which seems to completely resolve the issue.

Here's the same .then correctly being executed after applying that patch: [image]

Version

0.66.3

Output of react-native info

System:
    OS: macOS 12.0.1
    CPU: (8) arm64 Apple M1
    Memory: 142.27 MB / 16.00 GB
    Shell: 5.8 - /bin/zsh
  Binaries:
    Node: 16.13.0 - /private/var/folders/9p/k1yqxx0d7rn1nlztg_wm7sbw0000gn/T/xfs-2dcae145/node
    Yarn: 2.4.0-git.20210330.hash-ebcd71d5 - /private/var/folders/9p/k1yqxx0d7rn1nlztg_wm7sbw0000gn/T/xfs-2dcae145/yarn
    npm: 7.20.1 - ~/.nvm/versions/node/v16.13.0/bin/npm
    Watchman: 2021.11.01.00 - /opt/homebrew/bin/watchman
  Managers:
    CocoaPods: 1.11.0 - /Users/andreialecu/.rbenv/shims/pod
  SDKs:
    iOS SDK:
      Platforms: DriverKit 21.0.1, iOS 15.0, macOS 12.0, tvOS 15.0, watchOS 8.0
    Android SDK: Not Found
  IDEs:
    Android Studio: 4.2 AI-202.7660.26.42.7351085
    Xcode: 13.1/13A1030d - /usr/bin/xcodebuild
  Languages:
    Java: 11.0.8 - /Applications/Android Studio.app/Contents/jre/jdk/Contents/Home/bin/javac
  npmPackages:
    @react-native-community/cli: Not Found
    react: 17.0.2 => 17.0.2 
    react-native: 0.66.3 => 0.66.3 
    react-native-macos: Not Found
  npmGlobalPackages:
    *react-native*: Not Found

Steps to reproduce

I'm able to reproduce it with this:

  React.useEffect(() => {
    console.log('confirm start', new Date());
    fetch('https://cognito-idp.eu-west-1.amazonaws.com/', {
      method: 'POST',
      mode: 'cors',
    })
      .then(() => console.log('then', new Date()))
      .catch(() => console.log('catch', new Date()));
    setTimeout(() => {
      console.log('5 seconds passed');
    }, 5000);
  }, []);

Output: Screenshot 2021-12-09 at 19 29 06

After applying https://github.com/facebook/react-native/issues/29608#issuecomment-884521699: Screenshot 2021-12-09 at 19 30 32

Snack, code example, screenshot, or link to a repository

No response

Skip to this comment for the actual cause: https://github.com/facebook/react-native/issues/32730#issuecomment-990764376

andreialecu avatar Dec 09 '21 17:12 andreialecu

Note that this doesn't repro on the emulator, this is the output on the emulator:

 LOG  confirm x 2021-12-09T18:44:08.229Z
 LOG  then x 2021-12-09T18:44:08.607Z
 LOG  5 seconds passed

❗ The device where this is happening, and where I initially noticed it, is an Oppo A72 (CPH2067) running Android 11.

andreialecu avatar Dec 09 '21 18:12 andreialecu

I'm also not able to reproduce it on a Samsung Galaxy Tab A (SM-T290) device.

andreialecu avatar Dec 09 '21 19:12 andreialecu

In my case, this issue also affected only some devices. I think it has something to do with the network stack or driver on the device, or with how they interact with IPv6 devices out on the network.

danmaas avatar Dec 09 '21 19:12 danmaas

Interesting. I left it running for a while and it actually seems to time out on the connection:

 LOG  confirm x 2021-12-09T19:00:10.557Z
 LOG  5 seconds passed
 LOG  then x 2021-12-09T19:04:23.177Z

It took 4 minutes to time out.

I just noticed that the device does not have IPv6 connectivity at all based on https://test-ipv6.com/

Might be a router problem on my end, usually I have IPv6. However, I'm not sure why it would even attempt to use an IPv6 address considering it does not have IPv6 connectivity at all.
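To check which address families a hostname actually resolves to on a given network, a small helper along these lines can be used. It is a minimal sketch in plain Java, not code from React Native or OkHttp: `AddressFamilies`, `describe` and `ipv4First` are hypothetical names. The reordering helper only shows one possible mitigation, under the assumption that the HTTP client tries resolved addresses in the order it receives them; it is not the workaround linked earlier in this thread.

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class AddressFamilies {

    /** Labels one resolved address with its family, e.g. "IPv4 192.0.2.1". */
    static String describe(InetAddress address) {
        String family = (address instanceof Inet6Address) ? "IPv6" : "IPv4";
        return family + " " + address.getHostAddress();
    }

    /**
     * Returns a copy of the list with IPv4 addresses first.
     * The relative order inside each family is preserved.
     */
    static List<InetAddress> ipv4First(List<InetAddress> resolved) {
        List<InetAddress> ordered = new ArrayList<>();
        for (InetAddress address : resolved) {
            if (address instanceof Inet4Address) {
                ordered.add(address);
            }
        }
        for (InetAddress address : resolved) {
            if (!(address instanceof Inet4Address)) {
                ordered.add(address);
            }
        }
        return ordered;
    }
}
```

Passing each result of `InetAddress.getAllByName(host)` to `describe` shows whether the resolver hands back IPv6 addresses even when the device has no working IPv6 route.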

andreialecu avatar Dec 09 '21 19:12 andreialecu

Very interesting, I rebooted the router and now I have IPv6 again.

❗ The Oppo now no longer has the error. It works perfectly:

 LOG  confirm x 2021-12-09T19:28:07.714Z
 LOG  then x 2021-12-09T19:28:09.217Z
 LOG  5 seconds passed

I'm not sure what to make of this. 🤔

andreialecu avatar Dec 09 '21 19:12 andreialecu

I'll take care of sorting out your lives myself; there is not and will not be any problem, any fear, or anyone who attacks me. Greetings, carry on as you are and keep getting better, ok. Yes, it can be done. Sincerely: Jesus Francisco Urias García.

vamper424 avatar Dec 09 '21 22:12 vamper424

And once again your goal, this time an artificial one.

vamper424 avatar Dec 09 '21 22:12 vamper424

Hahaha, they're mediocre; go after them.

vamper424 avatar Dec 09 '21 22:12 vamper424

The root cause of this issue seems to be the lack of "Happy Eyeballs" in the underlying okhttp library that react-native is using on Android.

Here's the issue in okhttp that is tracking this: https://github.com/square/okhttp/issues/506

I think most of the random networking Android slowdowns reported all over the ecosystem are probably related. This is the one I ran into in particular: https://github.com/aws-amplify/amplify-js/issues/5539

Possibly related issues: https://github.com/facebook/react-native/issues/32467 https://github.com/facebook/react-native/issues/29782

This thread also mentions this exact same issue: https://github.com/facebook/react-native/issues/28283#issuecomment-736941578

I believe a PR to implement Happy Eyeballs into okhttp shouldn't be too hard, if someone would be able to contribute one.

From my understanding a TCP connection to both the IPv6 and IPv4 address needs to be attempted, and the first one to respond is used, while the other one is closed.
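The race described above can be sketched in plain Java. This is only an illustration of the idea under stated assumptions, not OkHttp's or React Native's implementation: `HappyEyeballsSketch`, `Connector` and `connectFirst` are hypothetical names, and addresses are plain strings so the connection step can be swapped out. It also starts all attempts at once, as described here, whereas RFC 8305 staggers them and gives IPv6 a short head start.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: race one connection attempt per address.
final class HappyEyeballsSketch {

    /** Hypothetical stand-in for "open a TCP connection to this address". */
    interface Connector<C extends AutoCloseable> {
        C connect(String address) throws Exception;
    }

    static <C extends AutoCloseable> C connectFirst(
            List<String> addresses, Connector<C> connector, long timeoutMs) throws IOException {
        if (addresses.isEmpty()) {
            throw new IOException("no addresses to try");
        }
        ExecutorService pool = Executors.newFixedThreadPool(addresses.size());
        CompletableFuture<C> winner = new CompletableFuture<>();
        AtomicInteger failures = new AtomicInteger();

        for (String address : addresses) {
            pool.execute(() -> {
                C connection;
                try {
                    connection = connector.connect(address);
                } catch (Exception e) {
                    // Report failure only once every attempt has failed.
                    if (failures.incrementAndGet() == addresses.size()) {
                        winner.completeExceptionally(e);
                    }
                    return;
                }
                // complete() returns false when another attempt already won
                // (or the caller gave up), so this connection is redundant.
                if (!winner.complete(connection)) {
                    closeQuietly(connection);
                }
            });
        }

        try {
            return winner.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            throw new IOException("all connection attempts failed", e.getCause());
        } catch (TimeoutException e) {
            throw new IOException("no address answered within " + timeoutMs + " ms", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while connecting", e);
        } finally {
            winner.cancel(false); // no-op if already completed; makes late finishers close themselves
            pool.shutdownNow();   // interrupts pending attempts; a blocking socket connect
                                  // still relies on its own connect timeout to return
        }
    }

    private static void closeQuietly(AutoCloseable connection) {
        try {
            connection.close();
        } catch (Exception ignored) {
            // Nothing useful to do if closing a losing connection fails.
        }
    }
}
```

With a real connector, something like `address -> { Socket s = new Socket(); s.connect(new InetSocketAddress(address, 443), 10_000); return s; }` could be passed in. A black-holed IPv6 address would then lose to a reachable IPv4 one instead of stalling the request.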

This would solve this ubiquitous Android networking problem once and for all.

andreialecu avatar Dec 10 '21 09:12 andreialecu

@andreialecu would you be so kind to review my PR?

marcesengel avatar Jan 16 '22 10:01 marcesengel

@marcesengel I think this should be addressed in the underlying okhttp library. Perhaps you might want to contribute it there instead?

There is some movement there that indicates this problem might finally get some attention.

See: https://github.com/square/okhttp/pull/7009

andreialecu avatar Jan 21 '22 13:01 andreialecu

@andreialecu looking at the old issue regarding this a change on their end seemed unrealistic without a new major release (due to interceptors needing to be raced, see for example this comment, which reads "It's an invasive feature to add because there's extra complexity. If you need it, don't wait for us."), which in turn would mean that it probably takes some time to adapt this to react-native.

I don't see any issue with implementing it for RN in the meantime. I'd have added it to okhttp if it didn't look like it'd take ages to catch on - in case it ever did.

marcesengel avatar Jan 21 '22 14:01 marcesengel

I see. That makes sense.

Not sure who would need to take a look at this. I can't review it myself since I'm not familiar with the RN code dealing with this.

Perhaps @yungsters would be able to take a look?

andreialecu avatar Jan 21 '22 14:01 andreialecu

@marcesengel I think this should be addressed in the underlying okhttp library. Perhaps you might want to contribute it there instead?

Essentially this. Please take a look at my answer here: https://github.com/facebook/react-native/pull/33045#issuecomment-1030221142 as I believe that's the most feasible solution we could take at this stage.

cortinico avatar Feb 04 '22 18:02 cortinico

I've built a patched version for use until OkHttp 5.x is stable (the Happy Eyeballs implementation has been confirmed for it since the first of February). It's available as marcesengel/react-native#0.66-patched-3.

marcesengel avatar Feb 04 '22 19:02 marcesengel

Cross-posting here as I realized I accidentally left the comment on the PR instead of the Issue: https://github.com/facebook/react-native/pull/33045#issuecomment-1030428147

Would you be up for opening a draft PR with the OkHttp bump to 5.x alpha? It could help us spot early breaking changes and we could test it against the internal infra to make sure nothing breaks. (That's also a task that someone else in the community can pick up if they wish).

cortinico avatar Feb 04 '22 23:02 cortinico

I'd be open to do the transition once major version 5 is stable, as for now there's a fix and I wouldn't want to change something multiple times in case more breaking changes are introduced before v5 hits stable. If somebody else would like to do it beforehand and take the risk in order to make it available faster that of course would be much appreciated :+1:

Do you know if there's an ETA for the stable release?

marcesengel avatar Feb 05 '22 12:02 marcesengel

Nope but you can subscribe here to get updates

  • https://github.com/square/okhttp/issues/506

cortinico avatar Feb 08 '22 15:02 cortinico

Do you know if there's an ETA for the stable release?

The upstream PR was just merged. I believe a stable release is not done yet. According to https://github.com/square/okhttp/issues/6954#issuecomment-1046608631 a new alpha is pending release.

Also, as mentioned here: https://github.com/square/okhttp/issues/506#issuecomment-1046824772

If you're prepared to deal with small API changes between now and 5.0 final you can put it into production today. For OkHttp ‘alpha’ means ‘API instability’ but the code is extremely stable.

andreialecu avatar Feb 21 '22 12:02 andreialecu

+1 any feedback on how the 5.0.0-Alpha05 version is working would be great. Or try using it and seeing what you need in terms of observability. Now is a good time to shape the final result.

yschimke avatar Feb 26 '22 10:02 yschimke

I think I can find some time to create a PR this week. However as of now I really am not aware of how big the API changes are, so I can't say if I'll finish it this week.

Does somebody know if switching to major version 5 will affect Fresco?

marcesengel avatar Feb 28 '22 11:02 marcesengel

It should be a drop-in replacement; it's backwards compatible down to 3.X. But it needs an extra call to enable it:

OkHttpClient client = new OkHttpClient.Builder()
    .fastFallback(true)
    .build();

yschimke avatar Feb 28 '22 11:02 yschimke

I think I can find some time to create a PR this week.

That's great 👍 Please ping me over the PR once you're done with it.

Does somebody know if switching to major version 5 will affect Fresco?

As @yschimke mentioned, we should not need any action on the Fresco side of things. I can loop them over if this is needed anyway.

cortinico avatar Mar 01 '22 10:03 cortinico

Small update: since I'm working on an M1 MacBook I first had to fix Buck for M1 chips and created a PR there: https://github.com/facebook/buck/pull/2684 (for some reason Docker doesn't work either, and this looked easier to fix). NDK 21.4 wasn't supported either, as it ships clang 9.0.9 instead of 9.0.8... I was wondering how the builds at Facebook are working?

Will be able to start work on the PR tomorrow 👍

marcesengel avatar Mar 03 '22 17:03 marcesengel

I was wondering how the builds at Facebook are working?

I'm not on a M1 so I haven't faced those issues at all. Thanks for pointing them out and addressing them though 👍

cortinico avatar Mar 03 '22 18:03 cortinico

Sorry for not putting it better - I didn't mean the M1 issues but the clang version included in the NDK, as RN is on NDK 21.4.xxx now, which ships with clang 9.0.9, while Buck is looking for clang 9.0.8 for every NDK major > 20.

Regardless I still have issues running the tests, as Buck tries to compile yoga with only -fno-omit-frame-pointer -O3 while in YGEnums.h for example constexpr is used, which would require at least C++ 11. When also passing -std=c++17 to fix this, I get String table does not end at end of file at react-native/buck-out/gen/ReactAndroid/src/main/jni/first-party/yogajni/jni#default,shared/libyoga.dylib 😖

I'd really like to get local unit tests working before I try to bump OkHttp3. Do you want me to create a new issue to tackle my problems with this?

Edit: On the official Clang website it says by default C++ 14 is used? Why am I having issues with this, maybe some Apple Clang specifics? https://clang.llvm.org/cxx_status.html

marcesengel avatar Mar 04 '22 16:03 marcesengel

I'd really like to get local unit tests working before I try to bump OkHttp3. Do you want me to create a new issue to tackle my problems with this?

Have you considered using the docker image instead?

cortinico avatar Mar 04 '22 18:03 cortinico

Yes, sadly that didn't work. Using the image I got a SIGILL (Could not load hsdis-amd64.so; library not loadable; PrintAssembly is disabled). According to some sources this can happen because the CPU (or in this case the x86 emulator) is missing an instruction, which could be fixed using the Java flag -XX:UseSSE=2. I tried supplying it using .buckjavaargs, to no avail. Alternatively hsdis-amd64.so could actually be missing, which is why I tried building the reactnativecommunity/react-native-android image on my machine. On arch amd64 this got stuck, without showing any errors, on the step [javac] Compiling 109 source files to /buck/ant-out/src-gen/classes (writing all of this down, that might have been due to the problem described above; maybe I should try building the container with -XX:UseSSE=2). Then I tried building for the arm64v8 arch, which is native on my machine; it completed the step the other arch was stuck at, but ultimately failed to install the Android emulator due to missing binaries, iirc.

Then I concluded: my machine has an emulator and everything set up, why don't I just do everything on the host? And then I ran into the issues described in my messages above. Thinking about it again, the unit tests might not require an emulator, do they? If not, I could omit the emulator from my local arm64v8 image and use it to execute the unit tests.

marcesengel avatar Mar 04 '22 19:03 marcesengel

I'm not sure if you even need buck to try this, honestly. You can do everything on the Gradle side of things (on Buck you'll have to bump the version number of OkHTTP).

I would replicate this commit and fix any build failures you might face via the CI or via a Gradle build: https://github.com/facebook/react-native/commit/d6db5c54640cef999b6922a10ceb85dede929b3f

cortinico avatar Mar 10 '22 12:03 cortinico

Yup been thinking the same recently, will do so until everything is ready for M1. It also seems like it's a Buck issue with the bumped 3rd party libs as someone else ran into the same problem building something else. Will also look into this.

marcesengel avatar Mar 10 '22 12:03 marcesengel