{"id":3388,"date":"2020-08-17T04:03:47","date_gmt":"2020-08-16T20:03:47","guid":{"rendered":"http:\/\/www.chenlianfu.com\/?p=3388"},"modified":"2025-08-04T00:24:14","modified_gmt":"2025-08-03T16:24:14","slug":"ceph%e6%95%85%e9%9a%9c%e4%bb%a5%e5%85%b6%e5%a4%84%e7%90%86%e6%96%b9%e6%b3%95","status":"publish","type":"post","link":"http:\/\/www.chenlianfu.com\/?p=3388","title":{"rendered":"CEPH Failures and How to Handle Them"},"content":{"rendered":"\n<h2>1. Slow OSD heartbeats<\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code># ceph -s\nhealth: HEALTH_WARN\n       Slow OSD heartbeats on back (longest 6181.010ms)\n       Slow OSD heartbeats on front (longest 5953.232ms)<\/code><\/pre>\n\n\n\n<p>OSDs ping each other to measure access latency. If the connection latency between two OSDs exceeds 1s, the latency is considered too high, which hurts data storage and access on the CEPH cluster. The latency between two OSDs can be measured over the internal network (between storage servers \/ back) and over the external network (from storage servers to the servers that use them \/ front). If the latency is too high, the affected OSDs are marked 
down, which can in turn lead to CEPH data loss.<\/p>\n\n\n\n<p>High latency between OSDs is usually caused by the network: either the network was restarted on one of the storage servers, or a network cable is faulty. In the first case the latency gradually drops and eventually returns to normal; in the second the problem persists. <a rel=\"noreferrer noopener\" href=\"https:\/\/ceph.readthedocs.io\/en\/latest\/rados\/operations\/monitoring\/#network-performance-checks\" target=\"_blank\">Check the detailed OSD latency information to find the host with high latency, then fix it.<\/a><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># ceph health detail\n\n&#91;WRN] OSD_SLOW_PING_TIME_BACK: Slow OSD heartbeats on back (longest 11846.602ms)\n    Slow OSD heartbeats on back from osd.12 &#91;] to osd.25 &#91;] 11846.602 msec\n    Slow OSD heartbeats on back from osd.8 &#91;] to osd.17 &#91;] 3617.281 msec\n    Slow OSD heartbeats on back from osd.16 &#91;] to osd.27 &#91;] 2784.517 msec\n    Slow OSD heartbeats on back from osd.21 &#91;] to osd.17 &#91;] 1678.064 msec\n    Slow OSD heartbeats on back from osd.11 &#91;] to osd.15 &#91;] 1675.884 msec\n    Slow OSD heartbeats on back from osd.20 &#91;] to osd.13 &#91;] 1073.790 msec\n&#91;WRN] OSD_SLOW_PING_TIME_FRONT: Slow OSD heartbeats on front (longest 11427.677ms)\n    Slow OSD heartbeats on front from osd.12 &#91;] to osd.25 &#91;] 11427.677 msec\n    Slow OSD heartbeats on front from osd.8 &#91;] to osd.17 &#91;] 3787.868 msec\n    Slow OSD heartbeats on front from osd.16 &#91;] to osd.27 &#91;] 3465.298 msec\n    Slow OSD heartbeats on front from osd.11 &#91;] to osd.15 &#91;] 1469.591 msec\n    Slow OSD 
heartbeats on front from osd.21 &#91;] to osd.17 &#91;] 1341.135 msec\n    Slow OSD heartbeats on front from osd.20 &#91;] to osd.13 &#91;] 1224.235 msec\n    Slow OSD heartbeats on front from osd.5 &#91;] to osd.16 &#91;] 1101.175 msec\n\nThe output above shows that the OSDs on one host have high latency to the OSDs on all other hosts. Unplugging that host's fiber cable, wiping it clean and plugging it back in resolved the problem.<\/code><\/pre>\n\n\n\n<h2>2. slow ops<\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code># ceph -s\n     21 slow ops, oldest one blocked for 29972 sec, mon.ceph1 has slow ops<\/code><\/pre>\n\n\n\n<p>First make sure the clocks on all storage servers are synchronized, then restart the monitor service on the affected host.<\/p>\n\n\n\n<h2>3. 
pgs not deep-scrubbed in time<\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code># ceph -s\n    47 pgs not deep-scrubbed in time<\/code><\/pre>\n\n\n\n<p>CEPH runs a consistency check on all PGs every 1~1.5 days (scrub: uses object metadata to verify that the replicas of a PG on its OSDs are complete and consistent; fast). Each time a PG is scrubbed, there is a certain probability that the scrub is promoted to a deep consistency check (deep-scrub: uses object contents to verify that the replicas of a PG on its OSDs are complete and consistent; slow and very heavy on disk reads). With the probability set to 5%, it takes 20~30 days to deep-scrub all PGs. The default osd_deep_scrub_interval, however, is 1 week, and a warning is raised once PGs have gone more than 2 weeks without a deep-scrub. Raising this threshold to one month (60*60*24*30 seconds) clears the warning.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph config set global osd_deep_scrub_interval 2592000 <\/code><\/pre>\n\n\n\n<h2>4. 
MDS cache is too large<\/h2>\n\n\n\n<pre class=\"wp-block-preformatted\">ceph config set mds mds_cache_memory_limit 10GB\n\nceph config dump<\/pre>\n\n\n\n<p>This warning appears when the cache used by the MDS is far above the configured limit. Setting a higher MDS cache limit with the command above clears the warning, at the cost of more memory. The config dump command shows the current values of all parameters.<\/p>\n\n\n\n<p>Even after mds_cache_memory_limit has been increased, the warning may come back after a while, because the MDS cache usage has again grown past 1.5 times the new limit. In that case, consider running several active MDS services.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"># First enable the MDS service on 3 servers. Make sure these 3 servers have enough memory, preferably more than the others.\nceph orch apply mds cephfs ceph106,ceph107,ceph109\nceph fs set cephfs max_mds 3\n\n# With the MDS on all 3 servers active, there is no standby MDS service left. Add one more host as a standby MDS.\nceph orch apply mds cephfs ceph106,ceph107,ceph109,ceph110<\/pre>\n\n\n\n<h2>5. 
Client node18 failing to respond to cache pressure<\/h2>\n\n\n\n<p>This means that responses between host node18 and the MDS service are slow. If the status returns to health_ok after a while, it can be ignored. If the warning persists, unmount the ceph file system on node18 and mount it again.<\/p>\n\n\n\n<p>While a client is using data, the MDS caches that data in the server's memory. When the MDS needs to reduce its cache usage, it sends a corresponding request to the client. If the client responds too slowly, this warning is raised. If this goes on, the MDS server cannot release its cache, and memory usage keeps growing until the server may even go down.<\/p>\n\n\n\n<p>The ceph cluster provides a metadata service so that clients can mount the ceph file system. When a client accesses data, the data is cached on both the client and the metadata server. The metadata server trims its cache according to the inodes held by the clients. <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.ceph.com\/en\/latest\/cephfs\/cache-configuration\/#mds-recall\" target=\"_blank\">If a client responds too slowly, Ceph reports \u201cfailing to respond to cache pressure\u201d or 
MDS_HEALTH_CLIENT_RECALL<\/a>. If the client really is under heavy load from normal read and write operations, consider increasing mds_recall_warning_decay_rate (default 60s) to clear the warning.<\/p>\n\n\n\n<p>The following command lists the ID of each ceph client and the number of inodes it holds (the num_caps value).<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph tell mds.0 session ls<\/code><\/pre>\n\n\n\n<p>Use the following commands with caution to evict a specific client or all clients.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph tell mds.0 session evict id=11134635\nceph tell mds.0 session evict<\/code><\/pre>\n\n\n\n<p>Evicting a client adds it to the blacklist. The following commands show the blacklist or remove entries from it. Even after a client is removed from the blacklist, it may still be unable to mount the ceph file system normally, so handle this with care.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph osd blacklist ls\nceph osd blacklist rm 192.168.20.1:0\/1498586492\nceph osd blacklist clear<\/code><\/pre>\n\n\n\n<h2>6. 
<a href=\"https:\/\/docs.ceph.com\/en\/latest\/rados\/operations\/health-checks\/?highlight=incomplete#pg-availability\" target=\"_blank\" rel=\"noreferrer noopener\">Reduced data availability: 4 pgs inactive, 4 pgs incomplete<\/a><\/h2>\n\n\n\n<p>When pgs become incomplete, the number of live OSDs for those pgs is below the minimum replica count (min_size). Their data therefore cannot be read or written and is in the reduced state, which causes problems for the MDS service and produces errors such as the following:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">3 MDSs report slow metadata IOs\n2 MDSs report slow requests\n2 MDSs behind on trimming\nReduced data availability: 4 pgs inactive, 4 pgs incomplete\n\npg 5.6de is incomplete, acting [254,356,222,352,111,247,100,133,351,206] (reducing pool cephfs_data min_size from 8 may help; search ceph.com\/docs for 'incomplete')\npg 5.6e9 is incomplete, acting [276,244,357,358,221,321,311,229,314,351] (reducing pool cephfs_data min_size from 8 may help; search ceph.com\/docs for 'incomplete')\npg 5.73b is incomplete, acting [186,279,351,247,293,354,359,220,181,283] (reducing pool cephfs_data min_size from 8 may help; search ceph.com\/docs for 'incomplete')\npg 5.eda is incomplete, acting [164,157,120,227,353,351,295,269,95,354] (reducing pool cephfs_data min_size from 8 may help; search ceph.com\/docs for 'incomplete')<\/pre>\n\n\n\n<p>At this point the pgs need to be repaired.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Query the pg (pg id 5.6de)\nceph pg 5.6de query\n\n# Forcibly recreate the pg (this creates an empty pg, so any data it held is lost)\nceph osd force-create-pg 5.6de --yes-i-really-mean-it\n\n<\/code><\/pre>\n\n\n\n<h2>7. 
failed to probe daemons or devices stderr:Non-zero exit code 125 from \/bin\/podman<\/h2>\n\n\n\n<p>The podman container on a few servers in the Ceph storage cluster is broken, so the corresponding services fail to start. The warning reads:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices<br>host ceph105 ceph-volume inventory failed: cephadm exited with an error code: 1, stderr:Non-zero exit code 125 from \/bin\/podman run --rm --ipc=host --net=host --entrypoint stat -e CONTAINER_IMAGE=docker.io\/ceph\/ceph:v15 -e NODE_NAME=ceph105 docker.io\/ceph\/ceph:v15 -c %u %g \/var\/lib\/ceph<br>stat:stderr Error: readlink \/var\/lib\/containers\/storage\/overlay\/l\/HMGABIBEWBRXOSBT4JLOKQIKDA: no such file or directory<br>Traceback (most recent call last):<br>File \"\", line 6112, in<br>File \"\", line 1299, in _infer_fsid<br>File \"\", line 1382, in _infer_image<br>File \"\", line 3581, in command_ceph_volume<br>File \"\", line 1477, in make_log_dir<br>File \"\", line 2084, in extract_uid_gid<br>RuntimeError: uid\/gid not found<\/pre>\n\n\n\n<p>Running the following command produces the error above, while healthy storage nodes do not report it.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cephadm shell<\/code><\/pre>\n\n\n\n<p>This kind of error indicates a problem with podman's docker container. Find the failing storage nodes:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph orch ps | grep error<\/code><\/pre>\n\n\n\n<p>Pull the docker image again on each storage node:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cephadm pull\npodman pull ceph\/ceph:v15\n# 
Either of the two commands above works. The latter shows the download speed, which avoids waiting a long time for a download of several hundred MB with no idea of the progress.\n# Pulling the image again upgrades the ceph version. This does not affect usage.<\/code><\/pre>\n\n\n\n<p>Check podman's docker images<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>podman images\npodman ps<\/code><\/pre>\n\n\n\n<p>Finally, restart the server or restart the CEPH services.<\/p>\n\n\n\n<h2>8. mds.cephfs.ceph109.avzzqn(mds.1): Behind on trimming (594\/128) max_segments: 128, num_segments: 594<\/h2>\n\n\n\n<p>An MDS server raises the alert:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[WRN] MDS_TRIM: 2 MDSs behind on trimming<br>mds.cephfs.ceph109.avzzqn(mds.1): Behind on trimming (594\/128) max_segments: 128, num_segments: 594<br>mds.cephfs.ceph106.hggsge(mds.0): Behind on trimming (259\/128) max_segments: 128, num_segments: 259<\/pre>\n\n\n\n<p>The MDS stores metadata as segments (objects). When the number of segments in the MDS exceeds mds_log_max_segments (default 128), the MDS starts trimming, that is, writing the segment data back. When the number of segments exceeds twice the configured value, the Behind on trimming alert is raised. If the MDS server has enough memory, increasing mds_log_max_segments is recommended.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph config set mds mds_log_max_segments 1024<\/code><\/pre>\n\n\n\n<h2>9. 
mds N slow requests are blocked &gt; 30 secs<\/h2>\n\n\n\n<p>The MDS service raises the alert:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><strong>[WRN] MDS_SLOW_REQUEST: 3 MDSs report slow requests\nmds.cephfs.ceph109.avzzqn(mds.1): 29 slow requests are blocked &gt; 30 secs\nmds.cephfs.ceph110.sfagxf(mds.2): 1 slow requests are blocked &gt; 30 secs\nmds.cephfs.ceph106.hggsge(mds.0): 3 slow requests are blocked &gt; 30 secs<\/strong><\/pre>\n\n\n\n<p>The alert above means the MDS is responding slowly. Possible causes: the mds service is running too slowly, a problem in the underlying pg or OSD leaves journal writes unacknowledged, or a bug. Setting mds_op_complaint_time to 3000 did not resolve the problem.<\/p>\n\n\n\n<p>When this warning appeared, the OSDs reported no errors, the mds service appeared to be running normally, and memory was sufficient. Checking the disks through the RAID controller showed that two servers each had one disk that was not detected. The likely cause is that those disks have failed and the OSDs have not yet reacted; this remains to be confirmed by further observation.<\/p>\n\n\n\n<h2>10. 
insufficient standby MDS daemons available<\/h2>\n\n\n\n<p>When an mds service crashes, a standby mds takes over. Compute servers that are already connected can still access the ceph storage normally, but new compute servers cannot mount the ceph file system.<\/p>\n\n\n\n<p>The fix is to ssh into the server whose mds service crashed and restart its mds service, then log in to the standby mds server and restart its mds service.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ssh ceph107\nsystemctl restart ceph-8f1c1f24-59b1-11eb-aeb6-f4b78d05bf17@mds.cephfs.ceph106.hggsge.service\nssh ceph102\nsystemctl restart ceph-8f1c1f24-59b1-11eb-aeb6-f4b78d05bf17@mds.cephfs.ceph102.imxzno.service\n<\/code><\/pre>\n\n\n\n<h2>11. 
OSD_TOO_MANY_REPAIRS: Too many repaired reads on 1 OSDs<\/h2>\n\n\n\n<p>When the disk behind an OSD has bad sectors, the deep scrub consistency check on PGs can find several PGs with problems on that OSD at the same time, which may trigger this warning. The warning does not go away even after all the PGs have been repaired.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[root@ceph101 ~]# ceph health detail\nHEALTH_WARN Too many repaired reads on 1 OSDs\n[WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 1 OSDs\n    osd.174 had 21 reads repaired\n<\/pre>\n\n\n\n<p>To clear the warning, log in to the corresponding CEPH host and restart the corresponding OSD service.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ssh ceph105\nsystemctl restart ceph-osd@174.service\n<\/code><\/pre>\n\n\n\n<h2>12. 
RECENT_CRASH: 2 daemons have recently crashed<\/h2>\n\n\n\n<p>When CEPH daemons crash, the crash warnings remain even after the problem has been resolved. For example, when bad sectors on a disk cause an OSD service to crash and restart automatically, the CEPH system itself is fine but the warning is left behind.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[root@ceph101 ~]# ceph health detail\nHEALTH_WARN 2 daemons have recently crashed\n[WRN] RECENT_CRASH: 2 daemons have recently crashed\n    osd.174 crashed on host ceph105 at 2022-09-02T22:09:29.443817Z\n    osd.174 crashed on host ceph105 at 2022-09-02T23:55:25.357799Z\n<\/pre>\n\n\n\n<p>Use the following command to archive the crash reports and clear the warning:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph crash archive-all\n<\/code><\/pre>\n\n\n\n<h2>13. 1 clients failing to advance oldest client\/flush tid<\/h2>\n\n\n\n<p>Normally, once a client finishes its read and write tasks, it notifies the MDS server to release the resources and updates the tid. If the client and the MDS can no longer communicate properly, the tid may never be updated, so memory in the MDS cannot be released, segments cannot be trimmed, and the MDSs behind on trimming alert is raised as well. Restarting the MDS server resolves this.<\/p>\n\n\n\n<h2>14. 
daemon mds.ceph102,ceph104,ceph106.ceph102.tjbtez on ceph102 is in error state<\/h2>\n\n\n\n<p>The message indicates that one mds service is in the error state. A close look at its name shows that it contains the names of several ceph servers, which is abnormal: a normal mds service name contains only the name of the one mds server it targets. Use the ceph command to find the mds services whose Status is error or unknown, then remove them.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91;root@ceph101 ~]#  ceph orch ps --daemon_type=mds\nmds.ceph102,ceph106,ceph108,ceph110.ceph107.wqiada  ceph107  error          41s ago    115s  &lt;unknown&gt;  docker.io\/ceph\/ceph:v15      &lt;unknown&gt;     &lt;unknown&gt;\nmds.cephfs.ceph102.imxzno                           ceph102  running (13m)  3m ago     4y    15.2.13    docker.io\/ceph\/ceph:v15      cc5b0b99041b  e2d2e396f2f4\n\n&#91;root@ceph101 ~]# ceph orch daemon rm mds.ceph102,ceph106,ceph108,ceph110.ceph107.wqiada<\/code><\/pre>\n\n\n\n<p>After the abnormal mds was removed, ceph behaved normally. After a while, however, more broken mds services of the same kind appeared. In that case, check the deployment plans (the services listed by ceph orch ls) and remove the faulty ones.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91;root@ceph101 ~]# ceph fs ls\r\nname: cephfs, metadata pool: cephfs_metadata, data pools: &#91;cephfs_data ]\r\n&#91;root@ceph101 ~]# ceph orch ls\r\nNAME                                 RUNNING  REFRESHED  AGE  PLACEMENT                                IMAGE NAME                            
IMAGE ID\r\nalertmanager                             1\/1  5m ago     4y   count:1                                  docker.io\/prom\/alertmanager:v0.20.0   5eb21d2cb030\r\ncrash                                  10\/10  5m ago     4y   *                                        mix                                   mix\r\ngrafana                                  0\/1  -          -    count:1                                  &lt;unknown>                             &lt;unknown>\r\nmds.ceph102,ceph104,ceph106              0\/2  5m ago     4y   count:2                                  docker.io\/ceph\/ceph:v15               &lt;unknown>\r\nmds.ceph102,ceph106,ceph108,ceph110      0\/2  5m ago     4y   count:2                                  docker.io\/ceph\/ceph:v15               &lt;unknown>\r\nmds.cephfs                               4\/4  5m ago     9w   ceph102;ceph104;ceph106;ceph108          mix                                   mix\r\nmgr                                      2\/2  5m ago     4y   ceph101;ceph107                          docker.io\/ceph\/ceph:v15               mix\r\nmon                                      5\/5  5m ago     4y   ceph101;ceph103;ceph105;ceph107;ceph109  docker.io\/ceph\/ceph:v15               mix\r\nnode-exporter                          10\/10  5m ago     4y   *                                        docker.io\/prom\/node-exporter:v0.18.1  8e76a0ec7d90\r\nprometheus                               1\/1  5m ago     4y   count:1                                  docker.io\/prom\/prometheus:v2.18.1     a0ca3ee7950c\r\n&#91;root@ceph101 ~]# ceph orch rm mds.ceph102,ceph104,ceph106 --force\r\nRemoved service mds.ceph102,ceph104,ceph106\r\n&#91;root@ceph101 ~]# ceph orch rm mds.ceph102,ceph106,ceph108,ceph110 --force\r\nRemoved service mds.ceph102,ceph106,ceph108,ceph110\r\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>1. 
Slow OSD heartbeats OSDs test each other (ping) &hellip; <a href=\"http:\/\/www.chenlianfu.com\/?p=3388\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts\/3388"}],"collection":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3388"}],"version-history":[{"count":34,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts\/3388\/revisions"}],"predecessor-version":[{"id":3998,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts\/3388\/revisions\/3998"}],"wp:attachment":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3388"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3388"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3388"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}