[ANR] Input dispatching timed out. Discuss thread (SOLVED)

#1

So, I have to start theme to discuss this ANR. As I know, all Defold games now have this type of ANR. On Family Age project it is very critical for now, because we have around 1% ANR (with threshold ~0.47%). I want to discuss this, what we can do and that the problem.


I have the two types of ANR: (50% / 50%)

Input dispatching timed out (com.test.familyage/com.dynamo.android.DefoldActivity, Waiting to send key event because the focused window has not finished processing all of the input events that were previously delivered to it. Outbound queue length: 0. Wait queue length: 1.)

#00 pc 0000000000048218 /system/lib/libc.so (__epoll_pwait+20)
#01 pc 0000000000019ebd /system/lib/libc.so (epoll_pwait+60)
#02 pc 0000000000019eed /system/lib/libc.so (epoll_wait+12)
#03 pc 0000000000011dd7 /system/lib/libutils.so (_ZN7android6Looper9pollInnerEi+118)
#04 pc 0000000000011cd3 /system/lib/libutils.so (_ZN7android6Looper8pollOnceEiPiS1_PPv+26)
#05 pc 0000000000092a7d /system/lib/libandroid_runtime.so (_ZN7android18NativeMessageQueue8pollOnceEP7_JNIEnvP8_jobjecti+22)
#06 pc 000000000064f5d5 /system/framework/arm/boot-framework.oat (Java_android_os_MessageQueue_nativePollOnce__JI+96)
at android.os.MessageQueue.nativePollOnce (Native method)
at android.os.MessageQueue.next (MessageQueue.java:323)
at android.os.Looper.loop (Looper.java:142)
at android.app.ActivityThread.main (ActivityThread.java:6393)
at java.lang.reflect.Method.invoke! (Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run (ZygoteInit.java:933)
at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:823)

Input dispatching timed out (Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago. Wait queue length: 14. Wait queue head age: 10535.3ms.)

#00 pc 000000000004aad4 /system/lib/libc.so (__epoll_pwait+20)
#01 pc 000000000001b74d /system/lib/libc.so (epoll_pwait+60)
#02 pc 000000000001b77d /system/lib/libc.so (epoll_wait+12)
#03 pc 0000000000010035 /system/lib/libutils.so (android::Looper::pollInner(int)+120)
#04 pc 000000000000ff25 /system/lib/libutils.so (android::Looper::pollOnce(int, int*, int*, void**)+32)
#05 pc 00000000000b5fe9 /system/lib/libandroid_runtime.so (android::NativeMessageQueue::pollOnce(_JNIEnv*, _jobject*, int)+24)
#06 pc 0000000000191155 /system/framework/arm/boot-framework.oat (Java_android_os_MessageQueue_nativePollOnce__JI+92)
at android.os.MessageQueue.nativePollOnce (Native method)
at android.os.MessageQueue.next (MessageQueue.java:325)
at android.os.Looper.loop (Looper.java:142)
at android.app.ActivityThread.main (ActivityThread.java:6651)
at java.lang.reflect.Method.invoke (Native method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run (RuntimeInit.java:438)
at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:810)


For last time I did try next next things to reduce amount of this ANRs:

  • Test without NE (Find only the ANR belong to Ads NE)
  • Compress sounds size. No effect.
  • Removed one 4x4 atlas and one 4x2. For now in game I have one or two 4x2 atlas and other atlasses are 2x2. No effect.
  • Async loading of main collection, using monarch, using flow. Optimized loading of resource files. No effect.
  • Separate GUI windows to other collections to async load them. (Reduced resource loading on main collection). No effect.
  • Finded the json decode performance bug. We implemented own NE to encode/decode JSON. No effect. But bug was fixed - this is nice moment :slight_smile:

All of this have no affect to ANR count. Next question is what I can do to reduce ANR from my side? What I can help to Defold Team to reduce it or find the problem?
Note, what I always use newest defold version

Can you tell, if you have the same ANR problem and percentage? On differents projects it have different value, so seems what it is depends from the client game code. Trying to find where it is, but failed :slight_smile:


So, let’s discuss what we can do?

4 Likes

#2

This would be interesting to get some statistics on! Anyone know of a site that shows ANR statistics?

Some things I’d do to dig into this problem to find a solution:

  • Could this be a garbage collection hitch? Enabled the debuggable flag in game.project Android section, bundle and install your game and launch it, then launch Android Studio and bring up the memory monitor for the Family Age process. Do you see a saw-tooth pattern indicating that you collect garbage very rapidly?
    • If you see the saw-tooth pattern it is an indication that you generate a lot of garbage every frame. Start looking at running scripts and try to figure out if you have game logic that creates many short lived objects (Lua tables, v3, v4, matrix etc).
  • Could this be the result of some script taking a lot longer than expected to run? Use our web profiler to see if you have life-cycle functions that take a lot of time to execute and then dig in using ProFi .
  • Could this be I/O operations that take a long time to execute? IO is a lot slower on many devices compared to modern laptops/desktops.
    • When and where do you read/write data? Can you spread it over multiple frames? Use ProFi again.
  • Make sure to remove empty life-cycle functions from your scripts to avoid unnecessary calls from the engine to Lua
  • Disable accelerometer events if you don’t use them
1 Like

#3

Thanks for advices, some of them I will try soon.

Some time ago I already reduced memory allocation of every frame, and it is possible, what a little reduced ANR percentage, but not sure. I can dig into this more. I try to share info with GC on android later

ProFI nice instrument too, already profiling with this.
Digging into I/O operations show us the json bug, but for now seems problem is not in I/O operations.

0 Likes

#4

I took a look at some other games (King) and see numbers from 0.03% up to around 1%, this with a mix of engines. I have one Defold game at 0.23% without much focus spent specifically on reducing ANRs on that title.

@AGulev what kind of numbers do you have on the Solitaire game of yours? And maybe also Bring Me Cakes?

0 Likes

#5

Current ANR rate for Solitaire game:


Bring Me Cakes:

Solitaire:

0 Likes

#6

Ok, so 0.04% for Classic Solitaire is proof that it is not the Defold engine itself that is at fault, at least not when used like in Classic Solitaire.

Bring Me Cakes shows a higher percentage (0.29%), is that with DefUnityAds or any other larger native extensions?

So, the question is what the difference is between Family Age and Solitaire (and to some extent Bring Me Cakes)?

  • Are you using any native extensions? If so, which ones?
  • Are you using local push notifications?
  • Are you stripping parts of the engine using an app manifest? If so, what are you removing?
  • That memory investigation I mentioned before would be interesting to know more about.
1 Like

#7

As soon as we get a little more data we will as well join the discussion as we always have suffered from a bunch of ANR’s. Most of them immediately at the loading screen. We are releasing a new version today and will gather info.

3 Likes

#8

As I remember, most of ANR on Blastlands load was caused by this issue.

1 Like

#9

Both projects without NE,
BMC:

excludeLibs: [“record”,“vpx”,“profilerext”,“engine”]
excludeSymbols: [“ProfilerExt”]
libs: [“engine_release”,“record_null”]

Solitaire:

excludeLibs:[“physics”,“LinearMath”,“BulletDynamics”,“BulletCollision”,“Box2D”,“record”,“vpx”,“profilerext”,“facebookext”]
excludeJars: ["(.)/facebooksdk.jar","(.)/facebook_android.jar"]
excludeSymbols: [“ProfilerExt”,“FacebookExt”]
libs: [“physics_null”,“record_null”]

0 Likes

#10

Any ideas why BMC would show higher numbers? More on_input code? Heavy code in update(), garbage collection, HTTP requests?

0 Likes

#11

No Idea right now, but I’ll continue my investigation. I wanna release the new version of solitaire with some fixes and with new Defold (waiting for new version now).

2 Likes

#12

1.08% is current ANR rate after added “exit from app” button:

0 Likes

#13

Huh, so ANRs have gone up and not down? This is opposite of what we have seen for other projects.

You should be using sys.exit() (or the old “exit” message) and never use os.exit(). Can you please confirm this?

Potential known causes of ANRs:

  • Engine
    • HTTP requests on crappy connections. This is caused by getaddrinfo being synchronous. We are working on this issue in our current sprint.
    • Local notifications. I see this issue in some games, but it is not very frequent. Do you have any ANRs with com.defold.push.LocalNotificationReceiver
  • User code
    • Doing too much work in a single frame. Examples:
      • Calculations in Lua code that take too long to execute.
      • Doing a lot of I/O operations (I/O operations are typically quite slow on mobile devices).
      • Using factory.create() or collectionfactory.create() on a factory component where Load Dynamically is checked and you haven’t called (collection)factory.load() prior to instantiating the first instance.
  • Using os.exit() instead of sys.exit()

If anyone has more ANR causes, please share them here!

3 Likes

#14

I’m using

msg.post("@system:", "exit", {code = 0})

Full list of registered ANR:

0 Likes

#15

So, I removed exit button on android too and have next results:

With exit button:

Without exit button:

Just for statistics. Exit was with msg too

timer.delay(0.01, false, function()
    msg.post("@system:", "exit", {code = 0})
end)
2 Likes

#16

Thanks for sharing your statistics!
We think the reason for ANR is this known issue:

Exit from the application one of the most simple way to reproduce this ANR but not the only one.

@jhonny.goransson is working on this issue.

4 Likes

#17

Ys, it’s actively being worked on. It will unfortunately not get finished in time for the release on Monday. It will be included in 1.2.151 instead.

5 Likes

#18

It should work fine in 1.2.153 (DEF-3815)

2 Likes

#19

It seems like in my game count of ANR impacted session reduced 3 times.

2 Likes