Winbind and SAP application coredumps in __nscd_get_nl_timestamp()
This document (000019920) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server for SAP Applications 12
Situation
The system log shows the nscd main process being killed with signal 9 (SIGKILL):
2021-02-10T09:47:39.720880+00:00 spps4hlapew03 systemd[1]: nscd.service: Main process exited, code=killed, status=9/KILL
2021-02-10T09:47:39.730474+00:00 spps4hlapew03 systemd[1]: nscd.service: Failed with result 'exit-code'.
The nscd crash is followed by core dumps of winbindd and SAP processes. In the winbindd core dump, the crash occurs in frame #8 while accessing the header of the persistent database file (map->head->nscd_certainly_running). The mapped address appears to have been unmapped when nscd stopped abruptly:
#glibc-2.22/nscd/nscd_gethst_r.c
-------------------------------------
101 __nscd_get_nl_timestamp (void)
102 {
103   uint32_t retval;
104   if (__nss_not_use_nscd_hosts != 0)
105     return 0;
106
107   /* __nscd_get_mapping can change hst_map_handle.mapped to NO_MAPPING.
108    However, __nscd_get_mapping assumes the prior value was not NO_MAPPING.
109    Thus we have to acquire the lock to prevent this thread from changing
110    hst_map_handle.mapped to NO_MAPPING while another thread is inside
111     __nscd_get_mapping. */
112   if (!__nscd_acquire_maplock (&__hst_map_handle))
113     return 0;
114
115   struct mapped_database *map = __hst_map_handle.mapped;
116
117   if (map == NULL
118       || (map != NO_MAPPING
119***>       && map->head->nscd_certainly_running == 0
120           && map->head->timestamp + MAPPING_TIMEOUT < time (NULL)))
121     map = __nscd_get_mapping (GETFDHST, "hosts", &__hst_map_handle.mapped);
122

(gdb) bt
#0  0x00007effbe9962a7 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007effbe99767a in __GI_abort () at abort.c:78
#2  0x00007effc21dcf0e in dump_core () at ../source3/lib/dumpcore.c:338
#3  0x00007effc21ce247 in smb_panic_s3 (why=<optimized out>) at ../source3/lib/util.c:814
#4  0x00007effc4ec9ddf in smb_panic (why=why@entry=0x7effc4f11c84 "internal error") at ../lib/util/fault.c:166
#5  0x00007effc4ec9ff6 in fault_report (sig=<optimized out>) at ../lib/util/fault.c:83
#6  sig_fault (sig=<optimized out>) at ../lib/util/fault.c:94
#7  <signal handler called>
#8  0x00007effbea7e419 in __nscd_get_nl_timestamp () at nscd_gethst_r.c:119
        retval = <optimized out>
        map = 0x56344e6c7470
#9  0x00007effbea690ec in get_nl_timestamp () at ../sysdeps/unix/sysv/linux/check_pf.c:87
No locals.
#10 cache_valid_p () at ../sysdeps/unix/sysv/linux/check_pf.c:98
        timestamp = <optimized out>
        timestamp = <optimized out>
#11 __check_pf (seen_ipv4=seen_ipv4@entry=0x7fff44fe0eb2, seen_ipv6=seen_ipv6@entry=0x7fff44fe0eb3, in6ai=in6ai@entry=0x7fff44fe0ec0, in6ailen=in6ailen@entry=0x7fff44fe0ec8) at ../sysdeps/unix/sysv/linux/check_pf.c:304
        olddata = 0x0
        data = 0x0
#12 0x00007effbea3b911 in __GI_getaddrinfo (name=0x56344e6d5f20 "172.16.1.5", service=0x7fff44fe10e0 "88", service@entry=0x0, hints=0x7fff44fe10b0, hints@entry=0x1, pai=0x7fff44fe10a8, pai@entry=0x56344e6e0ba0) at ../sysdeps/posix/getaddrinfo.c:2374
#13 0x00007effb9d24475 in system_getaddrinfo (res=res@entry=0x56344e6e0ba0, hint=hint@entry=0x1, serv=serv@entry=0x0, name=<optimized out>) at fake-addrinfo.c:1360
#14 my_fake_getaddrinfo (result=result@entry=0x56344e6e0ba0, hint=hint@entry=0x1, serv=serv@entry=0x0, name=<optimized out>) at fake-addrinfo.c:1161
#15 krb5int_getaddrinfo (node=<optimized out>, service=service@entry=0x7fff44fe10e0 "88", hints=hints@entry=0x7fff44fe10b0, aip=aip@entry=0x7fff44fe10a8) at fake-addrinfo.c:1361
#16 0x00007effbf5e0c25 in resolve_server (servers=0x0, conns=0x7fff44fe1090, udpbufp=0x7fff44fe10a0, message=0x7fff44fe1270, socktype2=1, socktype1=2, ind=0, context=0x56344e6dc6c0) at sendto_kdc.c:579
#17 k5_sendto (context=context@entry=0x56344e6dc6c0, message=message@entry=0x7fff44fe1270, servers=servers@entry=0x7fff44fe11e0, socktype1=socktype1@entry=2, socktype2=socktype2@entry=1, callback_info=callback_info@entry=0x0, reply=reply@entry=0x7fff44fe1280, remoteaddr=remoteaddr@entry=0x0, remoteaddrlen=remoteaddrlen@entry=0x0, server_used=server_used@entry=0x7fff44fe11dc, msg_handler=msg_handler@entry=0x7effbf5dfbb0 <check_for_svc_unavailable>, msg_handler_data=msg_handler_data@entry=0x7fff44fe11d8) at sendto_kdc.c:1037
#18 0x00007effbf5e133a in krb5_sendto_kdc (context=context@entry=0x56344e6dc6c0, message=message@entry=0x7fff44fe1270, realm=realm@entry=0x7fff44fe1290, reply=reply@entry=0x7fff44fe1280, use_master=use_master@entry=0x7fff44fe126c, tcp_only=tcp_only@entry=0) at sendto_kdc.c:218
#19 0x00007effbf5b712c in k5_init_creds_get (context=context@entry=0x56344e6dc6c0, ctx=0x56344e6d5840, use_master=use_master@entry=0x7fff44fe1408) at get_in_tkt.c:544
#20 0x00007effbf5b725d in k5_get_init_creds (context=context@entry=0x56344e6dc6c0, creds=creds@entry=0x7fff44fe2700, client=client@entry=0x56344e6dc630, prompter=prompter@entry=0x7effc0afdae0 <kerb_prompter>, prompter_data=prompter_data@entry=0x7fff44fe3538, start_time=start_time@entry=0, in_tkt_service=in_tkt_service@entry=0x0, options=options@entry=0x56344e6dd270, gak_fct=gak_fct@entry=0x7effbf5b85d0 <krb5_get_as_key_password>, gak_data=gak_data@entry=0x7fff44fe1480, use_master=use_master@entry=0x7fff44fe1408, as_reply=as_reply@entry=0x7fff44fe1420) at get_in_tkt.c:1782

(gdb) frame 8
#8  0x00007effbea7e419 in __nscd_get_nl_timestamp () at nscd_gethst_r.c:119
119       && map->head->nscd_certainly_running == 0
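The check at lines 117 to 120 of the listing can be modelled in isolation. The following is a minimal sketch, not the glibc implementation: all sketch_ names, the reduced structures and the 5-minute timeout value are assumptions made for illustration. It shows that once map is neither NULL nor NO_MAPPING, map->head is dereferenced unconditionally, so a head pointer into a region that is no longer mapped would fault at this point.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Assumed value for illustration; the real MAPPING_TIMEOUT is defined by glibc. */
#define SKETCH_MAPPING_TIMEOUT (5 * 60)

/* Sentinel meaning "no mapping available", modelled after NO_MAPPING. */
#define SKETCH_NO_MAPPING ((struct sketch_mapped_database *) -1l)

/* Reduced header: only the two fields read at lines 119 and 120. */
struct sketch_database_head {
  volatile int nscd_certainly_running;
  volatile time_t timestamp;
};

/* Reduced handle: only the fields inspected in the gdb session. */
struct sketch_mapped_database {
  const struct sketch_database_head *head; /* points into the shared mapping */
  int counter;                             /* reference counter */
};

/* Returns 1 when the client would request a fresh mapping, 0 otherwise. */
int sketch_needs_remap(const struct sketch_mapped_database *map, time_t now)
{
  if (map == NULL)
    return 1;
  if (map == SKETCH_NO_MAPPING)
    return 0;

  /* A NULL check is all that can be done cheaply here. A non-NULL pointer
     into memory that has since been unmapped passes this assertion and
     faults on the dereference below. */
  assert(map->head != NULL);

  return map->head->nscd_certainly_running == 0
         && map->head->timestamp + SKETCH_MAPPING_TIMEOUT < now;
}
```

With a valid header whose timestamp is 1000 and nscd_certainly_running is 0, the sketch returns 0 for now = 1300 and 1 for now = 1301.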
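The counter field of the mapped_database structure is greater than 0, yet the memory behind its head pointer (the persistent database header) cannot be accessed, which suggests the header was never mapped or has already been unmapped: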
(gdb) p ((struct mapped_database *)0x56344e6c7470)->counter
$4 = 1
(gdb) p *(((struct mapped_database *)0x56344e6c7470)->head)
Cannot access memory at address 0x7effb729f000
(gdb) x /8xw 0x7effb729f000
0x7effb729f000: Cannot access memory at address 0x7effb729f000
The process mappings show the shared mapping of the persistent database file marked as deleted:
(gdb) info proc mappings
mapped address spaces:
          Start Addr           End Addr       Size     Offset objfile
      0x7effb729f000     0x7effb72d4000    0x35000        0x0 /run/nscd/dbYFJari (deleted)
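To illustrate where the "(deleted)" marker comes from, the following sketch maps a temporary file as shared, unlinks it while the mapping is still in place, and then looks for the marker in /proc/self/maps. It is a Linux-only illustration under stated assumptions: the /tmp path is a hypothetical stand-in for the /run/nscd database file, the function name is made up, and the sketch neither reproduces the crash nor claims to show what nscd itself did.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a shared mapping of an unlinked file is listed with a
   "(deleted)" suffix in /proc/self/maps, 0 if not, -1 on setup errors. */
int sketch_mapping_marked_deleted(void)
{
  char path[] = "/tmp/sketch-db-XXXXXX"; /* hypothetical stand-in file */
  int fd = mkstemp(path);
  if (fd < 0)
    return -1;

  long page = sysconf(_SC_PAGESIZE);
  if (page <= 0 || ftruncate(fd, page) != 0) {
    close(fd);
    unlink(path);
    return -1;
  }

  void *p = mmap(NULL, (size_t) page, PROT_READ, MAP_SHARED, fd, 0);
  close(fd); /* the mapping stays in place after the descriptor is closed */
  if (p == MAP_FAILED) {
    unlink(path);
    return -1;
  }

  /* Remove the directory entry while the file is still mapped. */
  unlink(path);

  FILE *maps = fopen("/proc/self/maps", "r");
  if (maps == NULL) {
    munmap(p, (size_t) page);
    return -1;
  }

  /* Match on the file name only, in case /tmp resolves to another path. */
  const char *base = strrchr(path, '/') + 1;
  char line[4096];
  int found = 0;
  while (fgets(line, sizeof line, maps) != NULL) {
    if (strstr(line, base) != NULL && strstr(line, "(deleted)") != NULL) {
      found = 1;
      break;
    }
  }

  fclose(maps);
  munmap(p, (size_t) page);
  return found;
}
```

Calling sketch_mapping_marked_deleted() from a small main() and printing the result is enough to try it on a test system.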
The SAP processes also crash while accessing the persistent database file, in this case in __nscd_get_map_ref():
Message: Process 14763 (gwrd) of user 1001 dumped core.

Stack trace of thread 13071:
#0  0x00007f53b6a4e6c6 __nscd_get_map_ref (libc.so.6)
#1  0x00007f53b6a4ba66 nscd_gethst_r (libc.so.6)
#2  0x00007f53b6a2d08f gethostbyaddr_r@@GLIBC_2.2.5 (libc.so.6)
#3  0x00007f53b6a347c2 getnameinfo (libc.so.6)
#4  0x000055764eaa606c _Z16NiPGetHostByAddrPK11NI_NODEADDRhPDsjPP8_IO_FILE (gwrd)
#5  0x000055764ea0cc11 _ZN14NIHIMPL_LINEAR11getHostNameEPK11NI_NODEADDRPDsjhjPP8_IO_FILE (gwrd)
#6  0x000055764e9f90ba _Z14NiIGetHostNamePK11NI_NODEADDRPDsjhjPP8_IO_FILE (gwrd)
#7  0x000055764ea992f4 _Z12GwAddrToHostP11NI_NODEADDRPDsj (gwrd)
#8  0x000055764e960951 _Z12GwRqDpSendToP11REQUEST_BUFiiihP15DP_SESSION_INFOPi (gwrd)
#9  0x000055764ea21d17 _Z13GwRemGwHandlei (gwrd)
#10 0x000055764e98d0dd _ZL6GwLoopv (gwrd)
#11 0x000055764ea4780f nlsui_main (gwrd)
#12 0x000055764e959f1a main (gwrd)
#13 0x00007f53b694fa35 __libc_start_main (libc.so.6)
#14 0x000055764ea20d4d _start (gwrd)
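In the trace above, the application reaches the nscd client code through getnameinfo(), which calls gethostbyaddr_r(). The sketch below shows the kind of reverse lookup that takes this route. It is an illustration only: the function name is hypothetical, and whether the nscd hosts cache is actually consulted depends on the system configuration.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Reverse-resolves a dotted-quad IPv4 address into host (hostlen bytes).
   Returns 0 on success or an EAI_* code; a malformed address yields
   EAI_NONAME without any lookup being attempted. */
int sketch_reverse_lookup(const char *ipv4, char *host, size_t hostlen)
{
  struct sockaddr_in sa;
  memset(&sa, 0, sizeof sa);
  sa.sin_family = AF_INET;

  if (inet_pton(AF_INET, ipv4, &sa.sin_addr) != 1)
    return EAI_NONAME;

  /* No NI_NUMERICHOST flag: libc attempts a name lookup for the address.
     No NI_NAMEREQD flag: if no name is found, the numeric form is
     returned instead of an error. */
  return getnameinfo((const struct sockaddr *) &sa, sizeof sa,
                     host, (socklen_t) hostlen, NULL, 0, 0);
}
```

A caller would pass a buffer of sufficient size, for example 1025 bytes, and check the return value before using the host name.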
Resolution
Status
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000019920
- Creation Date: 18-Mar-2021
- Modified Date: 19-Mar-2021
- SUSE Linux Enterprise Server
- SUSE Linux Enterprise Server for SAP Applications
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com