Monday, June 13, 2022

Recent Questions - Server Fault


Assistance with WinNT MPM MaxRequestsPerChild

Posted: 13 Jun 2022 11:01 AM PDT

In our httpd-mpm.conf file, we have this section active:

# WinNT MPM
# ThreadsPerChild: constant number of worker threads in the server process
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_winnt_module>
    ThreadsPerChild       128
    MaxRequestsPerChild  1024
</IfModule>

MaxRequestsPerChild was originally 0, meaning no limit, but we were getting hard memory-allocation errors, so as per the Apache documentation we put a finite limit on this value. We're monitoring via /server-status, but I can't seem to correlate this value with anything that's showing up. I'd like to confirm that the change is working, and whether the value should be increased or decreased.

Excerpt from /server-status:

Srv PID Acc M SS Req Conn Child Slot Client VHost Request
0-36 59996 0/300/2456 _ 35 265 0.0 0.42 1.33 wks315.acme.local www.acme.com NULL
0-36 59996 2/180/2166 W 0 0 0.0 0.00 0.54 161.216.164.20 www.acme.com POST /loadMenu HTTP/1.1
0-36 59996 0/281/2426 _ 23 296 0.0 0.00 1.11 184.151.190.107 www.acme.com NULL
0-36 59996 0/9/1867 _ 15 390 0.0 0.00 1.40 192.168.5.41 www.acme.com NULL
0-36 59996 0/304/2294 _ 59 218 0.0 0.05 0.12 192.168.5.231 www.acme.com NULL
0-36 59996 4/274/2489 C 0 249 0.0 0.07 0.90 wks342 www.acme.com NULL

Legend:

Column Description
Srv Child Server number - generation
PID OS process ID
Acc Number of accesses this connection / this child / this slot
M Mode of operation
SS Seconds since beginning of most recent request
Req Milliseconds required to process most recent request
Conn Kilobytes transferred this connection
Child Megabytes transferred this child
Slot Total megabytes transferred this slot

I thought at first it might be the last value in Acc ("this slot"), meaning that if this exceeds 1024 the worker is restarted, but that's not the case. I've been monitoring the second value ("this child"), and it seems to peak around 320, never getting close to 1024. So I'm not sure what I should be looking at.
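One rough way to verify the recycle behaviour is to poll /server-status on an interval and watch the per-child Acc counters tick toward the limit; a minimal sketch, assuming curl is available and that the page layout matches the excerpt above (the PID 59996 is taken from it):

# Sample the scoreboard every 30 s and keep only rows for the child PID,
# so the "this child" Acc counter can be tracked against MaxRequestsPerChild
while true; do
    date
    curl -s "http://localhost/server-status" | grep "59996"
    sleep 30
done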

Proxy postgres connections adding client certificate authentication

Posted: 13 Jun 2022 10:54 AM PDT

In my environment, I would like to have a tool that takes in a user's certificates, and "adds" them to connections. The main reason to do this is to support client certificates as an authentication mechanism for tools that do not natively support it.

For web based connections, this is pretty easy. It's a javascript service that listens on a localhost port, and takes all requests that it gets and forwards them, adding the client cert information.

How could I do something similar for postgres? I know that it is possible to proxy postgres connections (pg_bouncer, for example, does this), but is there a tool that will allow me to do/build this? I am pretty flexible on language and how much work I have to do (short of rebuilding all of postgres), and the actual list of features to support is pretty small:

  • The proxy only needs to support one cert at a time, and all queries should run as that cert
  • Does not need to be high performance
  • Assume trusted/localhost authentication between the client and the proxy
  • Only needs to support DQL statements, and even there, nothing fancy is needed (to be honest, select support alone would be sufficient, so long as it was over the pq protocol)

Does there already exist a tool that can do this (or be modified to do it with a small amount of effort)? If not, what should I search for to make this as easy as possible?
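For what it's worth, PgBouncer itself can be bent into roughly this shape: accept trusted localhost connections and re-originate them to the upstream server over TLS with a client certificate. A minimal pgbouncer.ini sketch, with hostnames, paths and the database name as placeholders (note you may still need an auth_file listing the allowed user):

[databases]
; forward the local "mydb" to the real server
mydb = host=db.example.com port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = trust                 ; trusted localhost auth between client and proxy
server_tls_sslmode = verify-full
server_tls_ca_file = /etc/pgbouncer/root.crt
server_tls_cert_file = /etc/pgbouncer/client.crt  ; the user's client certificate
server_tls_key_file = /etc/pgbouncer/client.key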

Preserve original IP while forwarding

Posted: 13 Jun 2022 10:32 AM PDT

I'm using firewalld to forward an incoming port from the internet (9999) to a local LAN IP address (100.1.1.1) like this:

external (active)
  target: default
  icmp-block-inversion: no
  interfaces: tailscale0
  sources:
  services: ssh
  ports: 9999/tcp
  protocols:
  forward: yes
  masquerade: yes
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

public (active)
  target: default
  icmp-block-inversion: no
  interfaces: enp0s3
  sources:
  services: dhcpv6-client ssh
  ports: 9999/tcp
  protocols:
  forward: yes
  masquerade: no
  forward-ports:
        port=9999:proto=tcp:toport=9999:toaddr=100.1.1.1
  source-ports:
  icmp-blocks:
  rich rules:

The LAN IP (100.1.1.1) is from a Tailscale (VPN) interface running on the same machine, which delivers the traffic over the Internet to another machine.

The forwarding works fine, but my problem is that at the destination machine, all traffic appears to be coming from 100.1.1.1 (Tailscale) instead of the original source IPs. This is inconvenient for things like fail2ban or statistics.

Is there a way to preserve the source address while still allowing the traffic to be forwarded?

EDIT: According to this article https://mghadam.blogspot.com/2020/05/forward-traffic-from-public-ip-to.html?m=1 it should be possible, but complicated.
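The usual culprit for this symptom is masquerade: yes on the zone the forwarded traffic leaves through, which rewrites the source to the Tailscale address. One possible approach, sketched under the assumption that the destination machine can route replies back through this box (100.1.1.2 as its hypothetical Tailscale address):

# On the forwarding box: stop masquerading on the Tailscale-facing zone
firewall-cmd --permanent --zone=external --remove-masquerade
firewall-cmd --reload

# On the destination machine: send replies for internet sources back via the
# forwarder, otherwise replies to the preserved source addresses go out the wrong way
ip route add default via 100.1.1.2   # hypothetical forwarder Tailscale IP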

RPi 4 MDADM Raid 5 problems

Posted: 13 Jun 2022 10:03 AM PDT

I bought myself a quad SATA HAT from Raxdata and mounted 4 x 1 TB hard drives.

I tried to arrange them via mdadm in a RAID 5 setup (Debian). Unfortunately, after syncing maybe 10%, the operation stops and one or two of my drives are no longer "sdA" or "sdB" but "sdE" and "sdF". I get no error messages; maybe you can help me.

Thank You, greetings
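A few standard diagnostics may pin down where the drives go away (devices renaming from sda/sdb to sde/sdf usually means the kernel dropped and re-detected them, often a power or cabling issue on Pi SATA HATs); as a sketch:

# Watch kernel messages live during the resync; look for ata/link reset errors
journalctl -k -f

# Array state and resync progress
cat /proc/mdstat
sudo mdadm --detail /dev/md0      # assuming the array is md0

# Per-drive SMART health (requires smartmontools)
sudo smartctl -a /dev/sda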

what is the use of "remote(client) port" for inbound firewall rule?

Posted: 13 Jun 2022 10:01 AM PDT

In firewall settings, the local port for an inbound rule is pretty obvious: that is the port you want to listen on. However, the remote port sounds like nonsense: in a typical protocol, the client uses an arbitrary port, so restricting the remote port would break your service.

https://i.stack.imgur.com/MdHzW.png — the image is borrowed from "What are differnet between local port and remote port of firewall in Windows 2016 server?". Although the image shows Windows firewall settings, I guess other firewalls have something similar.

Is there any case for restricting the client port (remote port) for inbound traffic?
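One classic case is active-mode FTP, where the server initiates the data connection from its own port 20, so an inbound rule scoped to remote port 20 is meaningful. A hedged Windows example (the rule name is illustrative):

rem Allow inbound active-mode FTP data connections, which originate
rem from the remote server's port 20
netsh advfirewall firewall add rule name="FTP active data" dir=in action=allow protocol=TCP remoteport=20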

FreeIPA migrate the current NFSv4 storing home directories to another server

Posted: 13 Jun 2022 09:18 AM PDT

I have a FreeIPA set-up that uses NFSv4 to store users' home directories. NFS is running on the same physical server as FreeIPA. CentOS, btw. I'd like to move the NFS server to a new machine and add more storage.

I have searched for documentation but there's no guide on how to perform this, and not much information on the internet either. Maybe it's too trivial of a task and no one bothers to ask XD.

Clients get kerberos tickets to access their files.

If anyone has already done it or has an idea, could you please give me the steps to follow or things to try?

I thought of just copying the data and spinning up a new NFS server on another machine, but it's not obvious what I should update in the FreeIPA config so that they play together nicely.

Feel free to ask for additional information. Thanks!
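I haven't seen an official guide for this either, but with FreeIPA's CLI the broad strokes would presumably be: enroll the new host, give it an nfs/ service principal, fetch its keytab, copy the data, then repoint whatever automount maps or fstab entries reference the old server. A sketch with placeholder names (map and location names vary per setup):

# Enroll the new machine and create its NFS service principal
ipa host-add nfs2.example.com
ipa service-add nfs/nfs2.example.com

# On the new NFS server, pull the key into the system keytab
ipa-getkeytab -s ipa.example.com -p nfs/nfs2.example.com -k /etc/krb5.keytab

# Repoint a hypothetical automount entry for home directories
ipa automountkey-mod default auto.home --key "*" \
    --info "-fstype=nfs4,sec=krb5 nfs2.example.com:/home/&"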

Use ip6tables to forward traffic to setup DNAT/SNAT to link local address

Posted: 13 Jun 2022 09:17 AM PDT

I'm trying to use a Raspberry Pi to connect a terminal device to a VLAN. Basically I need to reach a device (that I cannot directly connect to a VLAN) remotely.

My idea is to connect the device (via eth) to the raspberry, join the raspberry to a VLAN and then proxy all the traffic between the VLAN and the device. I'm interested in proxying all (and only) IPv6 connections (TCP and UDP).

Network configuration is:

( Device )                    ( Raspberry )                   ( Laptop that need access )
DEVICE_IP <-----eth0-----> RASP_IP   RASP_VLANIP <----- VLAN ham0 ------> PC_VLANIP

I have set up the VLAN between the raspberry and the laptop using Hamachi. Then I have setup the following iptables rules:

ipt6 --in-interface ham0 --append PREROUTING --table nat --destination $RASP_VLANIP --jump DNAT --to-destination $DEVICE_IP
ipt6 --append POSTROUTING --table nat --destination $RASP_IP --jump SNAT --to-source $PC_VLANIP

I have then used iperf3 to test connectivity, launching it on the device and then trying to connect to it from the laptop using the address RASP_VLANIP. Anyway, I get the error: iperf3: error - unable to connect to server: Connection refused

What am I doing wrong?

Additional info:

net.ipv4.ip_forward = 0
net.ipv6.conf.all.forwarding = 1
net.ipv6.conf.eth0.forwarding = 1
net.ipv6.conf.wlan0.forwarding = 1
net.ipv6.conf.ham0.forwarding = 1

pi@raspi:~ $ ip -6 neigh show
fe80::213d:e705:e749:7a8f dev eth0 lladdr --:--:--:--:-- STALE
fe80::f079:e6fa:2c56:4984 dev eth0 lladdr --:--:--:--:-- STALE   <-- DEVICE_IP
2620:9b::1946:6064 dev ham0 lladdr --:--:--:--:-- STALE          <-- PC_VLANIP

pi@raspi:~ $ ifconfig
eth0:  inet6 fe80::213d:e705:e749:7a8f   <-- RASP_IP
ham0:  inet6 fe80::7879:19ff:fe22:e039   <-- RASP_VLANIP
wlan0: inet6 fe80::8cff:42dc:7fba:3289
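Two things stand out in the output above: every address involved is link-local (fe80::/10), which generally cannot be DNAT-ed and routed across interfaces without scope information, and the SNAT rule rewrites the source to the laptop's own address rather than an address belonging to the raspberry. A sketch of the more conventional rule set, assuming routable (ULA or global) addresses were used instead of link-local:

# Rewrite the destination of traffic arriving on ham0 for the Pi's VLAN address
ip6tables -t nat -A PREROUTING -i ham0 -d "$RASP_VLANIP" -j DNAT --to-destination "$DEVICE_IP"

# Source-NAT forwarded traffic to the Pi's eth0 address so the device can reply
ip6tables -t nat -A POSTROUTING -o eth0 -d "$DEVICE_IP" -j MASQUERADE

# Forwarded packets must also pass the FORWARD chain
ip6tables -A FORWARD -i ham0 -o eth0 -j ACCEPT
ip6tables -A FORWARD -i eth0 -o ham0 -m state --state ESTABLISHED,RELATED -j ACCEPT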

How to read and decode fio --bandwidth-log?

Posted: 13 Jun 2022 09:18 AM PDT

I'm looking for assistance with reading and decoding fio --bandwidth-log.

I've run the command below, and the output includes a few columns as listed; how do I read and decode each column?

fio --invalidate=1 --filename=/dev/nvme0n1 --direct=1 --ioengine=libaio --iodepth=32 --time_based --runtime=3600 --bandwidth-log --name=/dev/nvme0n1 --rw=randread --bs=4k --log_avg_msec=1000

Output example (first few lines):

501, 334730, 0, 0, 0
1177, 647294, 0, 0, 0
1678, 985860, 0, 0, 0
2180, 948023, 0, 0, 0
2681, 967369, 0, 0, 0
3182, 977405, 0, 0, 0
3683, 982035, 0, 0, 0

  1. How do I read the 1st column? No matter what period I provide for --time_based, I get 1024 results in total.
  2. The 2nd column doesn't fit IOPS, nor bandwidth in MB/s. I read somewhere that it is in KB/s and tried the conversion, which gives reasonable MB/s in some cases but doesn't work for mixed read/write commands.

The fio man page only mentions the following, with no explanation:

--bandwidth-log

Generate aggregate bandwidth logs.
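For context, fio's "Log File Formats" documentation describes each log line as time (msec), value, data direction, block size, offset, with the value in a bandwidth log given in KiB/s and direction 0/1/2 meaning read/write/trim; verify against the HOWTO for your fio version. Under that reading, a quick conversion of the sample above (bw.log standing in for whichever log file was produced):

# Columns: time_ms, bandwidth_KiB/s, direction (0=read), block_size, offset
# Print time in seconds and bandwidth in MB/s
awk -F', ' '{ printf "%.1f s  %.1f MB/s\n", $1/1000, $2*1024/1000000 }' bw.log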

GCP: How to access a L2 VM (qemu) running in a pod in gcp by IP from internet?

Posted: 13 Jun 2022 09:20 AM PDT

I have a cluster of 2 nodes created in GCP. The worker node (an L1 VM) has nested virtualization enabled. I have created a pod on this L1 VM, and I have launched an L2 VM using qemu in this pod. My objective is to access this L2 VM only by an IP address from the external world (internet). There are many services running in my L2 VM and I need to access it only by IP. I created a tunnel from the node to the L2 VM (which is within the pod) to get a DHCP address to my VM, but it seems the DHCP offer and ack messages are blocked by Google Cloud, so the L2 VM did not get a public IP. I found the following link from Google: cloud.google.com/compute/docs/instances/nested-virtualization/… It describes how to access an L2 VM from outside the L1 VM. It uses an alias IP of the primary interface of the VM instance; this alias IP is NAT-ed with the IP address of the L2 VM, so all packets destined for the alias IP reach the L2 VM. The only condition is that packets have to arrive at the alias IP. I have tried this and it works.

This technique even works if I try to connect to the L2 VM from the internet. In this case I send packets destined for the public IP of the L1 VM, and GCP automatically does ONE_TO_ONE_NAT between the public IP and the internal IP of the L1 VM's primary interface. But there is a problem when connecting from the internet: GCP flags it as a "man in the middle" attack. It says so correctly, because I NAT-ed the primary internal IP of the L1 VM to the L2 VM's IP.

My objective is to NAT the public IP of the L1 VM to the alias IP of the L1 VM's primary interface, so that I can further NAT the alias IP to the L2 VM's IP. Is it possible to NAT between the public IP of the L1 VM and its alias IP? Or is there another way to propagate a packet destined for the public IP of the L1 VM through to the L2 VM?
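For the last hop (alias IP → L2 VM) on the L1 VM itself, the usual shape is DNAT plus forwarding; whether GCP will deliver internet traffic to an alias IP without the flag described above is a separate question. An on-host sketch, with 10.128.0.10 as a hypothetical alias IP and 192.168.122.5 as a hypothetical L2 guest address:

# On the L1 VM: forward anything arriving for the alias IP to the L2 guest
sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A PREROUTING -d 10.128.0.10 -j DNAT --to-destination 192.168.122.5
iptables -t nat -A POSTROUTING -s 192.168.122.5 -j MASQUERADE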

Pod assigned node role instead of service account role on AWS EKS

Posted: 13 Jun 2022 09:00 AM PDT

First some info about the setup:

EKS version: 1.21
eksctl version: 0.77.0
AWS Go SDK version: v1.44.28
Deploying using kubectl

I have a k8s cluster on AWS EKS on which I am deploying a custom k8s controller for my application. Using instructions from eksworkshop.com, I created my service account with the appropriate IAM role using eksctl. I assign the role in my deployment.yaml as seen below. I also set the securityContext, as that seemed to solve the problem in some cases, as described here.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tel-controller
  namespace: tel
spec:
  replicas: 2
  selector:
    matchLabels:
      app: tel-controller
  strategy:
    rollingUpdate:
      maxSurge: 50%
      maxUnavailable: 50%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: tel-controller
    spec:
      serviceAccountName: tel-controller-serviceaccount
      securityContext:
        fsGroup: 65534
      containers:
      - image: <image name>
        imagePullPolicy: Always
        name: tel-controller
        args:
        - --metrics-bind-address=:8080
        - --health-probe-bind-address=:8081
        - --leader-elect=true
        ports:
        - name: webhook-server
          containerPort: 9443
          protocol: TCP
        - name: metrics-port
          containerPort: 8080
          protocol: TCP
        - name: health-port
          containerPort: 8081
          protocol: TCP
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          allowPrivilegeEscalation: false

But this does not seem to be working. If I describe the pod, I see the correct role.

AWS_DEFAULT_REGION:           us-east-1
AWS_REGION:                   us-east-1
AWS_ROLE_ARN:                 arn:aws:iam::xxxxxxxxx:role/eksctl-eks-tel-addon-iamserviceaccount-tel-t-Role1-3APV5KCV33U8
AWS_WEB_IDENTITY_TOKEN_FILE:  /var/run/secrets/eks.amazonaws.com/serviceaccount/token
Mounts:
  /var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
  /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-6ngsr (ro)

But if I do an sts.GetCallerIdentityInput() from inside the controller application, I see the node role. And obviously I get an access denied error.

caller identity: (go string) { Account: "xxxxxxxxxxxx", Arn: "arn:aws:sts::xxxxxxxxxxx:assumed-role/eksctl-eks-tel-nodegroup-voice-NodeInstanceRole-BJNYF5YC2CE3/i-0694a2766c5d70901", UserId: "AROAZUYK7F2GRLKRGGNXZ:i-0694a2766c5d70901" }

This is how I created my service account:

eksctl create iamserviceaccount --cluster ${EKS_CLUSTER_NAME} \
  --namespace tel \
  --name tel-controller-serviceaccount \
  --attach-policy-arn arn:aws:iam::xxxxxxxxxx:policy/telcontrollerRoute53Policy \
  --override-existing-serviceaccounts --approve

I have done this successfully in the past. The difference this time is that I also have a role and role bindings attached to this service account. My rbac.yaml for this SA:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tel-controller-role
  labels:
    app: tel-controller
rules:
- apiGroups: [""]
  resources: [events]
  verbs: [create, delete, get, list, update, watch]
- apiGroups: ["networking.k8s.io"]
  resources: [ingressclasses]
  verbs: [get, list]
- apiGroups: ["", "networking.k8s.io"]
  resources: [services, ingresses]
  verbs: [create, get, list, patch, update, delete, watch]
- apiGroups: [""]
  resources: [configmaps]
  verbs: [create, delete, get, update]
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: [get, create, update]
- apiGroups: [""]
  resources: [pods]
  verbs: [get, list, watch, update]
- apiGroups: ["", "networking.k8s.io"]
  resources: [services/status, ingresses/status]
  verbs: [update, patch]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tel-controller-rolebinding
  labels:
    app: tel-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tel-controller-role
subjects:
- kind: ServiceAccount
  name: tel-controller-serviceaccount
  namespace: tel

What am I doing wrong here? Thanks.
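One way to narrow this down is to confirm, from inside the running pod, that the IRSA environment is what the SDK actually sees, and which identity STS resolves for it; a sketch using the names from the manifests above (the second command assumes the aws CLI exists in the image):

# The injected IRSA environment, as seen by the process
kubectl -n tel exec deploy/tel-controller -- env | grep -E 'AWS_ROLE_ARN|AWS_WEB_IDENTITY'

# Which identity the default credential chain resolves to inside the pod
kubectl -n tel exec deploy/tel-controller -- aws sts get-caller-identity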

Redirecting traffic via Wireguard VPN

Posted: 13 Jun 2022 11:17 AM PDT

I have a public IPv6 address but no public IPv4. Therefore I want to route traffic via a VPS that has both a public IPv4 and an IPv6 address. My question is how to create this type of tunnel with WireGuard. The tunnel from the VPS to a device in my network is not the challenge, but rather how to redirect the packets on the server into that tunnel.

I've done a bit of research and my approach would look like this.

My Network device

[Interface]
Address = <DEVICE IPv6>
PrivateKey = <private key>
ListenPort = <DEVICE PORT>

# Peer to VPS
[Peer]
PublicKey = [PUBLIC KEY VPS]
AllowedIPs = [VPS IPv6]
Endpoint = [VPS IPv6]:[VPS PORT]

VPS

[Interface]
Address = <VPS IPv6>
Address = <VPS IPv4>
PrivateKey = <private key>
ListenPort = <VPS PORT>

# Peer to device
[Peer]
PublicKey = [PUBLIC KEY DEVICE]
Endpoint = [DEVICE IPv6]:[DEVICE PORT]
AllowedIPs = 0.0.0.0/0, ::/0

# Example peer of client
[Peer]
PublicKey = <client public key>
AllowedIPs = 0.0.0.0/0, ::/0

Example Client

[Interface]
PrivateKey = <private key>
ListenPort = <CLIENT PORT>

[Peer]
PublicKey = [PUBLIC KEY VPS]
Endpoint = [VPS IPv4]:[VPS PORT], [VPS IPv6]:[VPS PORT]
AllowedIPs = 0.0.0.0/0

Is this possible? Or do I need to create two WG interfaces and route the traffic between them?
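A single WG interface on the VPS is typically enough; the inbound redirection is ordinary DNAT into the tunnel. A sketch, assuming the device is given a hypothetical tunnel address 10.0.0.2 and the VPS tunnel interface is wg0:

# On the VPS: enable forwarding and steer inbound IPv4 (here port 80) into the tunnel
sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to-destination 10.0.0.2
iptables -t nat -A POSTROUTING -o wg0 -j MASQUERADE
iptables -A FORWARD -i eth0 -o wg0 -j ACCEPT
iptables -A FORWARD -i wg0 -o eth0 -m state --state ESTABLISHED,RELATED -j ACCEPT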

Why would my WordPress website be so slow?

Posted: 13 Jun 2022 10:23 AM PDT

Goal:

Make WordPress pages load faster on the local network. Currently pages load fine but take about 4-5 seconds. I want to cut that down to half the time or less.


I have two new VMs dedicated to this new website I'm testing (ON LOCAL NETWORK). One VM is for the SQL server, and the second VM is for the web server.

SQL Server Setup: (less than 50% mem/cpu max)

  • OS: Windows Server 2019
  • MySQL: Ver 8.0.29 for Win64 on x86_64 (MySQL Community Server - GPL)

Web Server Setup: (less than 50% mem/cpu max)

  • OS: Windows Server 2019
  • IIS: Ver 10.0.17763.1
  • IIS Compression module enabled
  • IIS WinCache module enabled
  • IIS Output Caching enabled
  • PHP: Ver 7.4.29 (cli) (built: Apr 14 2022 16:24:02) ( NTS Visual C++ 2017 x64 )

WordPress Setup:

  • Ver 5.9.3
  • NO plugins enabled

I currently do not have an SSL cert set up on the server because I want to get these performance issues resolved first. I was not sure if that would be a factor, but thought it wise to mention.

The site loads fine, it just takes forever. Is there anything that can be checked to see why the site is loading so slowly? Could it be related to MySQL somehow? From the default IIS site, a phpinfo.php page loads really fast. Any ideas of tests I could run to troubleshoot slow load times?

Note: I know there are a lot of server admins out there whose knee-jerk response will be "Don't use Windows, it's bad." I'm not looking for that kind of "help" here. I know it runs fine on Windows using IIS; I have seen plenty of posts from users saying they have no issues running sites on Windows with IIS.

Thanks for any suggestions you can provide, whether a solution or debugging help!
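Two cheap checks that often explain multi-second loads in a split web/DB setup like this are name-resolution delay between the two VMs and slow queries; both can be probed without plugins. A sketch (hostnames and paths are placeholders):

# 1. Rule out DNS delay: point WordPress at the DB by IP in wp-config.php
#      define('DB_HOST', '192.168.1.20');   // hypothetical SQL VM address
#    and/or add to my.ini on the SQL VM, then restart MySQL:
#      skip-name-resolve

# 2. Log anything slower than 1 s on the SQL VM (my.ini):
#      slow_query_log = 1
#      long_query_time = 1

# 3. Time the raw page from the web server itself, to separate server time from network
curl -o NUL -s -w "total: %{time_total}s  ttfb: %{time_starttransfer}s\n" http://localhost/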

Choose all labels in a Grafana time series chart in one go instead of "Ctrl+click"-picking each label until the full list is marked?

Posted: 13 Jun 2022 10:10 AM PDT

In a time-series chart in Grafana, I am trying to mark a bunch of labels so that all of their curves are shown.

By default, I get only four labels' curves in the graph, but I have dozens of labels, and I do not want to mark everything with the mouse; it takes too much time and nerves.

The idea is likely that choosing too many curves will leave you lost in the lines. But in this case, the graph is about finding outliers, strong changes and trends, or just high numbers. You can hover over any curve that might catch your eye, and that is all. Thus, having 80 curves in one graph is no problem.

The filter is just about shrinking the list, not about marking all labels in it. I can use it to Regex-check for queries with 2-digit seconds duration and some other filter on the query_name.


Yet, I just want to see all labels' curves in one go, rather than Ctrl+clicking each label one at a time.


Is there a trick to get this done? Perhaps even by using the Grafana Dashboard code to mark the jobs as a hardcoded list? Or is there a shortcut or other trick to pick all?

sar service has stopped collecting data

Posted: 13 Jun 2022 10:43 AM PDT

We had sar working on our Ubuntu server, but we did some work on the server and now it has stopped logging to the day's logfile.

sar -b 5 5  

This indicates that sar is alive, and monitoring data, but

ubuntu@testing:/var/log/sysstat$ sar  

outputs:

Linux 5.4.0-1063-azure (server)     02/22/22    _x86_64_    (4 CPU)

10:22:05     LINUX RESTART  (4 CPU)
10:22:46     LINUX RESTART  (4 CPU)
10:24:25     LINUX RESTART  (4 CPU)
16:34:04     LINUX RESTART  (4 CPU)

The cron entry and the sysstat config have not changed.

*/1 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1  

Why isn't stats data getting added to the log?
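On Ubuntu the collector is gated by a flag and, on newer releases, by systemd timers, so both are worth checking; a sketch of the usual suspects:

# Collection must be enabled here, or debian-sa1 exits silently
grep ENABLED /etc/default/sysstat          # should be ENABLED="true"

# Newer sysstat versions collect via systemd timers rather than cron
systemctl status sysstat sysstat-collect.timer

# Run the collector by hand and check whether today's file grows
sudo /usr/lib/sysstat/debian-sa1 1 1
ls -l /var/log/sysstat/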

Cloudfront caching resources even though response headers should prevent it

Posted: 13 Jun 2022 08:09 AM PDT

I have recently setup a Cloudfront distribution with the following behaviour cache policy:

  • TTL settings:
    • Minimum TTL (seconds): 0
    • Maximum TTL (seconds): 31536000
    • Default TTL (seconds): 0
  • Cache Keys:
    • Headers - None
    • Cookies - None
    • Query strings - All

Unfortunately, pages with no-cache response headers still get cached at fairly low levels of concurrency. I used apachebench to run 100 requests with a concurrency of 5, and received the following:

100 Cache-Control: no-cache, no-store, must-revalidate, max-age=0
 25 X-Cache: Hit from cloudfront
 75 X-Cache: Miss from cloudfront

I also captured response headers that should be unique per request/response (given there are no request headers/cookies in the cache key), and this also shows duplicate Set-Cookie responses. For example, this response came back 4 times:

      4 Set-Cookie: csrftoken=h2uU7TKHJ6AicHgOIaJTwC5qIXJN4Zwf; Domain=.mysite.com; expires=Tue, 17-Jan-2023 15:10:37 GMT; Max-Age=31449600; Path=/  

I do have ways around this, I believe, such as a higher-priority CloudFront behaviour that sets a no-cache policy; however, that takes away the server side's power to decide whether a response should be cached dynamically, and indicates that CloudFront is not honouring the server-side decision.
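One way to separate TTL caching from simultaneous-request collapsing is to compare serial requests against concurrent ones; a sketch (the URL is a placeholder):

# Serial: with Min/Default TTL 0 and no-cache from the origin, these
# should all be misses
for i in $(seq 1 10); do
  curl -s -o /dev/null -D - "https://www.example.com/page" | grep -i x-cache
done

# Concurrent: hits appearing here but not above point at collapsed requests
ab -n 100 -c 5 "https://www.example.com/page"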

How can I send all logs from journald to GCP Logging?

Posted: 13 Jun 2022 08:08 AM PDT

Don't know what more to add to the question; I just want to send all logs. My applications write logs to journald; there are no files on disk.

UPD. Just to clarify: there are files where journald stores its logs; my applications do not create any log files themselves.
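If running a log agent is acceptable, one route is fluent-bit, which ships a systemd input and a stackdriver (Cloud Logging) output; whether this matches the currently recommended Google Ops Agent configuration is worth verifying, so treat this as a sketch:

[INPUT]
    Name    systemd          # reads records directly from the journal
    Tag     host.*

[OUTPUT]
    Name      stackdriver    # ships to Google Cloud Logging
    Match     *
    resource  gce_instance   # assumes the host is a GCE VM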

Can We Use Openstack Neutron as standalone component?

Posted: 13 Jun 2022 09:02 AM PDT

Scenario: we are trying to develop our virtual network using Open Virtual Network (OVN). We need our virtual network switches running behind a root virtual router, and there are some other virtual switches as well. I have been working with Open vSwitch and OVN, but a lot of things are missing for me, there are some unconnected dots in the deployment itself, and there are some problems. My question has three parts:

  1. Is it possible at all to use Neutron as our Network Manager without OpenStack cloud installation?
  2. Is it a good approach?
  3. If not, are there any cookbooks available for OVN? I am not talking about man pages.

Any help would be highly appreciated.

P.S. I am thinking about this because OVN got merged with Neutron, and it could be one of the possibilities.

AWS - VPC creation date

Posted: 13 Jun 2022 10:34 AM PDT

Could you possibly let me know how I can check when a VPC was created? Or how to check in CloudTrail, via the CLI, who created the VPC?

I've tried to use CloudTrail and searched the event name for CreateVpc, but I was not able to find anything.
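Two details worth knowing here: CloudTrail's event history only goes back 90 days, which alone can explain finding nothing for an older VPC, and the lookup must run in the region where the VPC was created. The equivalent CLI query, for reference:

aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventName,AttributeValue=CreateVpc \
    --region eu-west-1    # hypothetical region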

How to remove corrupt RPM install

Posted: 13 Jun 2022 09:18 AM PDT

I tried to install an RPM package and the install process failed. It looks like the program needs its kernel modules signed, or something similar. Now I'm stuck in a weird state where rpm says the package is installed, but when I try to uninstall it, it claims it's not installed.

sudo rpm -i mypackage.rpm
  package mypackage is already installed

sudo rpm -e mypackage.rpm
  error: package mypackage is not installed

How can I resolve the install/uninstall state? I'd like to remove the package.
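Note that rpm -e expects the installed package name (as reported by rpm -q), not the .rpm file name, which by itself can produce the "not installed" message. A sketch of the usual untangling steps:

# Find the exact installed name
rpm -qa | grep -i mypackage

# Erase by name; skip the package's scriptlets if those are what is failing
sudo rpm -e --noscripts mypackage

# Last resort: remove only the database entry, leaving files untouched
sudo rpm -e --justdb --noscripts mypackage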

Install a .reg file via GPO

Posted: 13 Jun 2022 10:55 AM PDT

I have downloaded a .reg file with some registry keys I'd like to apply on a Windows machine. Since the same keys need to be applied across machines, I'd like to do it directly with GPO policies.

I found several guides; however, none explicitly states a way to push a .reg file's contents directly.

Could you please explain a clean way to do it?
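Two common approaches: recreate the keys as Group Policy Preferences Registry items (the GPP wizard can browse a reference machine's live registry), or push the .reg file itself from a GPO startup script. The script variant as a sketch (the UNC path is a placeholder):

rem Startup script assigned via GPO; imports the shared .reg file silently
reg import "\\fileserver\share\settings.reg"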

Amazon EFS hangs when attempting to list files inside

Posted: 13 Jun 2022 11:08 AM PDT

When doing an ls inside an Amazon EFS mount point, it just hangs.

The EFS troubleshooting section in the AWS documentation mentions the following:

Mount Does Not Respond

An Amazon EFS mount appears unresponsive. For example, commands like ls hang.

Action to Take

This error can occur if another application is writing large amounts of data to the file system. Access to the files that are being written might be blocked until the operation is complete. In general, any commands or applications that attempt to access files that are being written to might appear to hang. For example, the ls command might hang when it gets to the file that is being written. This is because some Linux distributions alias the ls command so that it retrieves file attributes in addition to listing the directory contents.

To resolve this issue, verify that another application is writing files to the Amazon EFS mount, and that it is in the Uninterruptible sleep (D) state, as in the following example:

$ ps aux | grep large_io.py

root 33253 0.5 0.0 126652 5020 pts/3 D+ 18:22 0:00 python large_io.py /efs/large_file

After you've verified that this is the case, you can address the issue by waiting for the other write operation to complete, or by implementing a workaround. In the example of ls, you can use the /bin/ls command directly, instead of an alias, which will allow the command to proceed without hanging on the file being written. In general, if the application writing the data can force a data flush periodically, perhaps by using fsync(2), this might help improve the responsiveness of your file system for other applications. However, this improvement might be at the expense of performance when the application writes data.

So I checked whether anything was writing to it, but the only things that showed up were:

root 43556 0.0 0.0 124356 756 pts/6 D+ 19:15 0:00 ls --color=auto /efs/

root 43558 0.0 0.0 112664 972 pts/3 S+ 19:16 0:00 grep --color=auto efs

So nothing is being written to EFS as far as I know. Are there any other things I can look into as causes of this?

I also tried mounting the EFS on a separate machine just to verify, and I also tested another machine in a different AZ, against the mount target in that AZ, and saw the same behavior.

update:

lsof shows:

nfsv4.1-s 113422 root cwd DIR 202,1 4096 128 /

nfsv4.1-s 113422 root rtd DIR 202,1 4096 128 /

nfsv4.1-s 113422 txt cwd unknown /proc/113422/exe

This disappears when unmounted, and reappears after mounting.
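A few more client-side checks that may help characterize the hang, as a sketch:

# NFS client mount options and per-operation statistics for the mount
nfsstat -m
mountstats /efs            # part of nfs-utils on many distros

# Kernel-side hung-task / NFS timeout messages
dmesg | tail -50

# Network path to the mount target (EFS serves NFS on TCP 2049)
nc -zv fs-xxxxxxxx.efs.us-east-1.amazonaws.com 2049   # placeholder target DNS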

How to identify and fix Missing blocks reported by Ambari for the NameNode?

Posted: 13 Jun 2022 11:08 AM PDT

Ambari is generating an alert: NameNode Blocks Health: Total Blocks:[38252543], Missing Blocks:[2]. No further information.

I've run hdfs fsck / which is reporting the entire filesystem as healthy. I've run hdfs dfsadmin -report which reports that there are two missing blocks, but does not give details.

How do I find these missing blocks and thence fix them?
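fsck can be asked for the corrupt/missing list explicitly, which usually names the affected paths; a sketch:

# List files that have corrupt or missing blocks
hdfs fsck / -list-corruptfileblocks

# Block-level detail, grepping for anything reported missing
hdfs fsck / -files -blocks -locations | grep -iB2 missing

# If the blocks are genuinely unrecoverable, deleting the files clears the alert
hdfs fsck /path/to/file -delete    # destructive; only after recovery attempts fail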

Virtnetwork Cannot Start Virtualizor KVM

Posted: 13 Jun 2022 09:06 AM PDT

I have a problem with my virtnetwork. I have set the correct network interface in the Virtualizor master settings, but it says:

/etc/sysconfig/: error fetching interface information: Device not found Error: No ip address found.

when I try to run

service virtnetwork start  

Can anyone help me? Here is my network interface config, ifcfg-ens9:

NAME="ens9"  DEVICE="ens9"  ONBOOT=yes  NETBOOT=yes  UUID="805c90c6-a8d2-49f1-8707-44696466a9fa"  IPV6INIT=yes  BOOTPROTO=none  TYPE=Ethernet  DNS1=127.0.0.1  DEFROUTE=yes  IPV4_FAILURE_FATAL=no  IPV6_AUTOCONF=yes  IPV6_DEFROUTE=yes  IPV6_FAILURE_FATAL=no  IPADDR=xxx.xxx.187.234  PREFIX=29  GATEWAY=xxx.xxx.187.233  IPV6_PEERDNS=yes  IPV6_PEERROUTES=yes  

Looking forward to a solution. I don't know what's wrong with the settings above.

Passenger could not spawn process for application. Rails, ubuntu, passenger and apache

Posted: 13 Jun 2022 09:06 AM PDT

My application is crashing again and again. I am using DigitalOcean's hosting services; the status of the server is shown in the image. (It starts working when I restart the server, but that is not a solution; after some time it crashes again.) The Apache error log shows:

[ 2016-07-05 16:40:39.2615 1768/7ff87c09b700 age/Cor/App/Implementation.cpp:304 ]: Could not spawn process for application /home/abc/kbs: An er$
  Error ID: b1aaad3d
  Error details saved to: /tmp/passenger-error-ygdDZS.html
  Message from application: An error occurred while starting the web application. It exited before signalling successful startup back to Phusion$
<h2>Raw process output:</h2>
(empty)

App 21534 stdout:   ^[[1m^[[35mCategoryProduct Load (23.0ms)^[[0m  SELECT "category_products".* FROM "category_products" WHERE "category_product$
App 21534 stdout:   ^[[1m^[[36m (0.5ms)^[[0m  ^[[1mSELECT "category_products"."name" FROM "category_products" WHERE "category_products"."product$
terminate called after throwing an instance of 'Passenger::SystemException'
App 21534 stdout:   ^[[1m^[[35mCACHE (0.3ms)^[[0m  SELECT  "countries".* FROM "countries" WHERE "countries"."id" = $1  ORDER BY name LIMIT 1  [[$
  what():  Cannot fork a new process: Cannot allocate memory (errno=12)
ERROR: cannot fork a process for executing 'tee'
[ pid=1768, timestamp=1467751239 ] Process aborted! signo=SIGABRT(6), reason=SI_TKILL, signal sent by PID 1768 with UID 0, si_addr=0x6e8, random$
[ pid=1768 ] Could not create crash log file, so dumping to stderr only.
[ pid=1768 ] Could fork a child process for dumping diagnostics: fork() failed with errno=12

The contents of the file /tmp/passenger-error-ygdDZS.html are below.

An error occurred while starting the web application. It exited before signalling successful startup back to Phusion Passenger. Please read this article for more information about this problem.

Raw process output:
(empty)

Error ID: b1aaad3d
Application root: /home/abc_user/app_name
Environment (value of RAILS_ENV, RACK_ENV, WSGI_ENV, NODE_ENV and PASSENGER_APP_ENV): production
Ruby interpreter command: /usr/bin/passenger_free_ruby
User and groups: uid=1000(abc_user) gid=1000(abc_user) groups=1000(abc_user),27(sudo)

Environment variables:
APACHE_PID_FILE=/var/run/apache2/apache2.pid
SHELL=/bin/bash
APACHE_RUN_USER=www-data
PASSENGER_DEBUG_DIR=/tmp/passenger.spawn-debug.XXXXijaXyT
USER=abc_user
PASSENGER_USE_FEEDBACK_FD=true
APACHE_LOG_DIR=/var/log/apache2
PATH=/home/abc_user/.rbenv/plugins/ruby-build/bin:/home/abc_user/.rbenv/shims:/home/abc_user/.rbenv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/home/abc_user/app_name
APACHE_RUN_GROUP=www-data
LANG=C
NODE_PATH=/usr/share/passenger/node
RBENV_SHELL=bash
SHLVL=0
HOME=/home/abc_user
LOGNAME=abc_user
SERVER_SOFTWARE=Apache/2.4.7 (Ubuntu) Phusion_Passenger/5.0.26
APACHE_LOCK_DIR=/var/lock/apache2
APACHE_RUN_DIR=/var/run/apache2
IN_PASSENGER=1
PYTHONUNBUFFERED=1
RAILS_ENV=production
RACK_ENV=production
WSGI_ENV=production
NODE_ENV=production
PASSENGER_APP_ENV=production
SCRIPT_URL=/products/spray-paints-hand-tools
SCRIPT_URI=https://www.domain.com/products/name_of_brand
HTTPS=on
SSL_TLS_SNI=www.domain.com

Ulimits: Unknown

System metrics:
------------- General -------------
Kernel version    : 3.13.0-79-generic
Uptime            : 8h 41m 28s
Load averages     : 4.03%, 2.24%, 0.96%
Fork rate         : unknown

------------- CPU -------------
Number of CPUs    :    2
Average CPU usage : 100%  -- 100% user,   0% nice,   0% system,   0% idle
  CPU 1           : 100%  -- 100% user,   0% nice,   0% system,   0% idle
  CPU 2           : 100%  -- 100% user,   0% nice,   0% system,   0% idle
I/O pressure      :   0%
  CPU 1           :   0%
  CPU 2           :   0%
Interference from other VMs:   0%
  CPU 1                    :   0%
  CPU 2                    :   0%

------------- Memory -------------
RAM total         :   2001 MB
RAM used          :   1879 MB (94%)
RAM free          :    122 MB
Swap total        :      0 MB
Swap used         :      0 MB (-nan%)
Swap free         :      0 MB
Swap in           : unknown
Swap out          : unknown

Some information about my server:

Server version: Apache/2.4.7 (Ubuntu)

Server built: Jan 14 2016 17:45:23

Phusion Passenger 5.0.26

Memory (RAM): 2 GB; disk space: 40 GB
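The log itself points at memory exhaustion: fork() failing with errno=12, RAM at 94% used, and 0 MB swap. A common stopgap on a 2 GB droplet, sketched:

# Create and enable a 2 GB swap file
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Persist across reboots
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab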

AWS connection error: Permission denied (publickey)

Posted: 13 Jun 2022 10:05 AM PDT

Sorry if this sounds redundant to you, but trust me, it's not. I have tried the majority of the links related to this problem, but nothing has worked for me so far. I even tried this article too. Below is what I have tried so far:

  1. Permission of the keys 400 as well 600
  2. ubuntu as the username because its the Ubuntu 14.04
  3. IP is correct and I even tried public dns as well
  4. Key is attached to the instance
  5. AWS Java client (MindTerm) using the Firefox browser. It gives an error after I press Enter when it shows me the line below with my IP. Even when I get lucky, it just asks for a username; I enter ubuntu and then it exits with the error "I/O error - read failed: unknown error", or it just takes me back to the IP step.

MindTerm home: /home/waqas/.mindterm/
SSH Server/Alias: 54.191.37.141
Connected to server running SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2

Server's hostkey (ssh-rsa) fingerprint:
openssh md5:  95:44:f1:40:07:90:00:2a:7d:9a:1f:49:a1:71:8a:0b
bubblebabble: xilon-segen-tufep-manir-rekad-lucag-fetoz-sover-hyhuh-kafiz-kixox

The last thing I did before this issue: 2 days ago I was trying to install an FTP server on the instance using this link: http://www.krizna.com/ubuntu/setup-ftp-server-on-ubuntu-14-04-vsftpd/. Unfortunately, the guide didn't work as expected and I had no success with FTP logins. And today, when I tried to log in using my keypair, it gives me this error.

Below is the log of my ssh attempt:

waqas@waqas-itu:~/Downloads/key$ ssh -v -i test.pem ubuntu@54.191.37.141
OpenSSH_6.0p1 Debian-3ubuntu1.2, OpenSSL 1.0.1c 10 May 2012
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug1: Connecting to 54.191.37.141 [54.191.37.141] port 22.
debug1: Connection established.
debug1: identity file test.pem type -1
debug1: identity file test.pem-cert type -1
debug1: Remote protocol version 2.0, remote software version OpenSSH_6.6.1p1 Ubuntu-2ubuntu2
debug1: match: OpenSSH_6.6.1p1 Ubuntu-2ubuntu2 pat OpenSSH*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.0p1 Debian-3ubuntu1.2
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-ctr hmac-md5 none
debug1: kex: client->server aes128-ctr hmac-md5 none
debug1: sending SSH2_MSG_KEX_ECDH_INIT
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ECDSA 80:dd:8f:50:a3:80:81:00:39:06:e4:05:6e:f3:1f:16
debug1: Host '54.191.37.141' is known and matches the ECDSA host key.
debug1: Found key in /home/waqas/.ssh/known_hosts:108
debug1: ssh_ecdsa_verify: signature correct
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: Roaming not allowed by server
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey
debug1: Next authentication method: publickey
debug1: Offering RSA public key: testserverpem.pem
debug1: Authentications that can continue: publickey
debug1: Offering RSA public key: waqas.jamal@***.com
debug1: Authentications that can continue: publickey
debug1: Trying private key: test.pem
debug1: read PEM private key done: type RSA
debug1: Authentications that can continue: publickey
debug1: No more authentication methods to try.
Permission denied (publickey).
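The log shows the server simply rejecting every offered key, so the first check is whether test.pem still matches a public key in the instance's authorized_keys; a sketch:

# Derive the public key from the local private key...
ssh-keygen -y -f test.pem

# ...and compare against what the instance holds (via another session,
# or by attaching the volume to a working instance):
cat /home/ubuntu/.ssh/authorized_keys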

auditd auid changes after su

Posted: 13 Jun 2022 10:05 AM PDT

I am trying to implement individual accountability on my RHEL systems using SELinux and the audit.log. I followed the instructions given here: Log all commands run by admins on production servers

If I understand it correctly, pam_loginuid.so should keep the UID that was used to log in and set it as the AUID in the audit.log file. Unfortunately, that does not work after su. When I log in to the system and call cat /proc/self/loginuid, it displays my correct UID. If I invoke sudo su - and call cat /proc/self/loginuid again, it displays 0. The ID 0 is also used as the AUID in audit.log for commands I invoke after sudo su -.

What am I doing wrong here?

Here is my pam.d/sshd file:

auth       include      system-auth
account    required     pam_nologin.so
account    include      system-auth
password   include      system-auth
session    optional     pam_keyinit.so revoke
session    required     pam_loginuid.so
session    include      system-auth

I enabled audit=1 in /etc/grub.conf and edited /etc/audit/audit.rules as described in the post above.
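It may help to see which PAM services set or reset the loginuid, since only login-type services (sshd, login, cron) should include pam_loginuid.so; if su or sudo pull it in, the original AUID gets overwritten. A quick check, as a sketch:

# Which PAM service files reference pam_loginuid?
grep -l pam_loginuid /etc/pam.d/*

# Follow a session in the audit log by the original login UID
ausearch -ul 1000 -i        # 1000 = hypothetical login UID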

Where are credentials for SQL Management Studio saved?

Posted: 13 Jun 2022 09:30 AM PDT

We logged into SQL Server Management Studio (using server name, login and password) with "Remember Password" checked. I need to know where this is saved on the PC.

I need to format my PC, and when I reinstall SQL Server Management Studio I will lose all the credentials I saved. That's why I need to find and back up the files where they are stored.
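For what it's worth, the commonly cited location is a per-user file under AppData: SqlStudio.bin for older SSMS versions and UserSettings.xml for SSMS 18+. Note the stored passwords are encrypted per Windows user profile, so the file may not be usable after a reformat even if backed up; the version numbers below are examples:

rem SSMS 2016 and earlier
%AppData%\Microsoft\SQL Server Management Studio\13.0\SqlStudio.bin

rem SSMS 18 and later
%AppData%\Microsoft\SQL Server Management Studio\18.0\UserSettings.xml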

Fastest way to extract tar.gz

Posted: 13 Jun 2022 09:20 AM PDT

Is there any way to extract a tar.gz file faster than tar -zxvf filenamehere?

We have large files and are trying to optimize the operation.
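gzip decompression is single-threaded, so the realistic gains come from dropping the per-file listing and from pigz, which uses extra threads for reading, writing and checksumming even though the inflate itself stays serial. A sketch:

# Skip -v; printing every file name can dominate on archives with many files
tar -xzf filenamehere

# Decompress with pigz, piping into tar
pigz -dc filenamehere | tar -x

# GNU tar can be told the decompressor directly
tar -I pigz -xf filenamehere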
