How " numpy.ndarray() " works? Posted: 30 Jun 2021 08:19 AM PDT I'm a beginner at using Numpy. I already know how to make arrays via Numpy. (through numpy.array(list) :D) But I got some wierd result when I ran the code like below: a = numpy.ndarray(4) print(a) Then the result was: [7.29205600e-312 6.95208261e-310 4.18967668e-321 3.79442416e-321] So I'm curious at What is the role of " numpy.ndarray(int) " as a function, and What mechanism underlies the random-like number in the array. |
How does the CoreCLR resolve project dependencies? Posted: 30 Jun 2021 08:19 AM PDT Lately, I have been working on a tool that verifies whether a given binary could load the needed dependencies at runtime. For .NET Framework binaries the resolution of dependencies is pretty straightforward, as mentioned here. For .NET Core I am still confused about the way the CoreCLR locates and loads assemblies. As far as I know at the moment, the CoreCLR searches in three different places: - Application base directory
- Shared directory
- Nuget cache
The resolution process relies on the presence of three JSON files: - the *.runtimeconfig.json file, which specifies which version of .NET Core the application depends on
{ "runtimeOptions": { "tfm": "netcoreapp3.1", "framework": { "name": "Microsoft.NETCore.App", "version": "3.1.0" } } } - *.runtimeconfig.dev.json that contains additional probing paths
{ "runtimeOptions": { "additionalProbingPaths": [ "C:\\Users\\yjirari\\.dotnet\\store\\|arch|\\|tfm|", "C:\\Users\\yjirari\\.nuget\\packages" ] } } - *.deps.json that lists dependencies of the application and their relative path to the Nuget cache
{ "runtimeTarget": { "name": ".NETCoreApp,Version=v3.1", "signature": "" }, "compilationOptions": {}, "targets": { ".NETCoreApp,Version=v3.1": { "ConsoleApp1/1.0.0": { "dependencies": { "Newtonsoft.Json": "12.0.1" }, "runtime": { "ConsoleApp1.dll": {} } }, "Newtonsoft.Json/12.0.1": { "runtime": { "lib/netstandard2.0/Newtonsoft.Json.dll": { "assemblyVersion": "12.0.0.0", "fileVersion": "12.0.1.25517" } } } } }, "libraries": { "ConsoleApp1/1.0.0": { "type": "project", "serviceable": false, "sha512": "" }, "Newtonsoft.Json/12.0.1": { "type": "package", "serviceable": true, "sha512": "sha512-ppPFpBcvxdsfUonNcvITKqLl3bqxWbDCZIzDWHzjpdAHRFfZe0Dw9HmA0+za13IdyrgJwpkDTDA9fHaxOrt20A==", "path": "newtonsoft.json/12.0.1", "hashPath": "newtonsoft.json.12.0.1.nupkg.sha512" } } } The *.runtimeconfig.json purpose is pretty obvious which is allowing the runtime to know which version of .NET Core the application was built against. For *.runtimeconfig.dev.json is to add Nuget cache and other directories as additional probing directories But for .deps i am still confused about its goal. To clarify the role of .deps, i did a little experiment using a console project that depends on Newtonsoft.Json v 12.0.1. There are two main cases: - When .deps is present in bin folder:
If Newtonsoft.Json with the same or a higher version (e.g. v12, v13) also exists in bin, the application executes normally. Otherwise Newtonsoft.Json gets loaded from the NuGet cache. - When .deps is absent:
If Newtonsoft.Json with the same or a higher version also exists in bin, the application executes normally. Otherwise a FileLoadException is raised. Conclusion: I concluded from the result of my experiment that: - If the dependency is present in the bin folder with the same or a higher version, it gets loaded.
- The role of .deps only comes into play if the dependency doesn't exist in the bin folder; it helps load the dependency from the NuGet cache by concatenating the NuGet cache path from *.runtimeconfig.dev.json with the relative path specified in .deps.
My questions are the following: - Why does the CoreCLR load dependencies with a higher version than the one referenced by the project?
- Does the *.deps file play any role in dependency resolution other than locating assemblies in the NuGet cache?
- Does the CoreCLR parse the .deps file before searching for dependencies?
I saw these two documents on CoreCLR resolution, but I didn't get much about the role of .deps from them. https://github.com/dotnet/cli/blob/v2.0.0/Documentation/specs/corehost.md https://docs.microsoft.com/en-us/dotnet/core/dependency-loading/default-probing |
What are the risks of variables with protected visibility Posted: 30 Jun 2021 08:18 AM PDT I'm trying to implement a state pattern as explained in this example. I've gotten to code similar to what's below. class State { public: virtual void enter() {}; virtual void update() = 0; virtual void exit() {}; virtual void setContext(Context* cxt) { this->context = cxt; } protected: Context* context; }; class Context { public: void do_something(); void do_something_else(); void transitionTo(std::unique_ptr<State> next_state) { if (state != nullptr) { state->exit(); } state = std::move(next_state); state->setContext(this); state->enter(); } private: std::unique_ptr<State> state; }; class ConcreteStateA : public State { public: void update() override { try { context->do_something(); } catch (...) { context->transitionTo(std::unique_ptr<ConcreteStateB>()); } } }; class ConcreteStateB : public State { // ... }; However, when I try to compile this with clang-tidy I get the following warning error: member variable 'context' has protected visibility [cppcoreguidelines-non-private-member-variables-in-classes,-warnings-as-errors] I have the following 2 questions: - What are the risks of giving a variable protected visibility?
- Does anyone have suggestions on how to solve this error in a clean way? (I've thought about creating a protected getter method, but if I want to act upon the correct context I will have to return a reference or pointer, which has the same effect as this but just requires extra code.)
|
I'm unable to catch errors from an async function Posted: 30 Jun 2021 08:18 AM PDT Given the following function to update values from the backend: const updateValues = async (arg1, arg2, arg3) => { try { const response = await axios.patch( ... ); ... return response.data; } catch (err) { console.error(err.response); return {}; } }; I have not been able to successfully react to the returned Promise; here is my implementation: const response = updateValues('bar', id, value1, value2, 'foo' ); response .then(() => console.log('success:', response)) .catch((err) => console.log('fail:', err)); What I'm getting is that, regardless, the response always resolves and I never get to catch an error. What am I doing wrong? |
How to Enter the Size of Array in main without knowing beforehand? Posted: 30 Jun 2021 08:18 AM PDT I'm trying to pass an array as a parameter using this function. In the function, I let the user enter how big the array is, as well as the values in said array. What I can't figure out is how to declare the variables in main that will allow me to use the function in main, and more specifically, how do I declare the array variable in main without knowing the size beforehand (user enters size in function). void arrayFunction(int array1[], int arraySize); int main() { int arrayLength; int arrayMain[]; cout << "Enter length of array: " << endl; cin >> arrayLength; arrayFunction(arrayMain, arrayLength); return 0; } void arrayFunction(int array1[], int arraySize){ cout << "Enter length of array: " << endl; cin >> arraySize; for(int i = 0; i < arraySize; i++) { cout << "Enter value #" << i + 1 << endl; cin >> array1[i]; } } |
How to select a specific MIDI Sequencer in Java? Posted: 30 Jun 2021 08:18 AM PDT I'm working on a MIDI program and want the user to have the option to select which MIDI sequencer is used if they have many, instead of using MidiSystem.getSequencer(). My code looks like this; it tries to loop through all devices and set the sequencer to a sequencer whose name matches the one selected in a drop-down menu. String name = (String) selector.getSelectedItem(); Sequencer sequencer = null; MidiDevice device; MidiDevice.Info[] infos = MidiSystem.getMidiDeviceInfo(); for (int i = 0; i < infos.length; i++) { try { device = MidiSystem.getMidiDevice(infos[i]); if (device.getDeviceInfo().getName().equals(name)) { //TODO: This line does not create a valid sequencer if (deviceType == Device.SEQUENCER) { sequencer = (Sequencer) device; } } } catch (MidiUnavailableException e) { System.out.println("Cannot locate device " + name); } } if (sequencer != null) { playbackClass.setSequencer(sequencer); System.out.println("Sequencer Updated to " + name); } The print statement executes, but the playback class can no longer play any MIDIs, so I don't think the way I have cast the sequencer is correct. |
Why does SIGTERM handling mess up pool.map() handling? Posted: 30 Jun 2021 08:18 AM PDT When I add signal handling, namely SIGTERM, to code that uses multiprocessing.pool.map() in a cycle, it causes it to hang after a nondeterministic number of iterations, at least with Python 3.8 on Ubuntu. #!/usr/bin/env python3 import threading import signal from multiprocessing import Pool def callme(num, frame): pass def worker(): pass numworkers = 8 def runpool(): work_tasks = [] with Pool(processes=numworkers) as pool: try: results = pool.map(worker, work_tasks, 1) except Exception as exc: print("exc: {}".format(exc)) def worker_function(): while True: print("run") runpool() args = [] my_thread = threading.Thread(target=worker_function, name="Sync thread", args=args, daemon=True) my_thread.start() signal.signal(signal.SIGTERM, callme) while True: siginfo = signal.sigwaitinfo({signal.SIGTERM}) print("py: got %d from %d by user %d\n" % (siginfo.si_signo, siginfo.si_pid, siginfo.si_uid)) Specifically, it is the signal.signal() call that leads to this behavior. When run, this program prints a bunch of "run" and hangs. The KeyboardInterrupt stack looks like this: Traceback (most recent call last): File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap self.run() File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/usr/lib/python3.8/multiprocessing/pool.py", line 114, in worker task = get() File "/usr/lib/python3.8/multiprocessing/queues.py", line 355, in get with self._rlock: File "/usr/lib/python3.8/multiprocessing/synchronize.py", line 95, in __enter__ return self._semlock.__enter__() Does anyone know why that is? Is this a limitation of Python itself, or is the above program coded in a wrong way? I am aware that the pool sends SIGTERM to the pool workers via the context manager when the work is over, so this behavior seems to be related. |
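A workaround sketch (an assumption on my part, not a confirmed diagnosis): since the question itself notes that the pool sends SIGTERM to its workers when the context manager exits, one thing to try is resetting the workers' SIGTERM disposition back to the default via a Pool initializer, so the parent's no-op handler is not inherited by the worker processes.

import signal
from multiprocessing import Pool

def _reset_sigterm():
    # Runs in each pool worker: restore the default SIGTERM behaviour so
    # Pool.terminate() can actually stop the workers on context-manager exit.
    signal.signal(signal.SIGTERM, signal.SIG_DFL)

def worker(task):
    pass

def runpool(work_tasks, numworkers=8):
    with Pool(processes=numworkers, initializer=_reset_sigterm) as pool:
        return pool.map(worker, work_tasks, 1)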
Pandas skipping blank rows Posted: 30 Jun 2021 08:18 AM PDT I have a list of items and 1 of them is a blank ' '. I need to use this list of characters to check whether the first character is one of the items before I do more cleaning up on the string. My list is in a text.txt file. When I write the text.txt and open it, I can see the ' ' is there. But when I read it with pd.read_csv, the blank row is skipped. Is this a limitation of pandas? Below is an example of my code reading back the text file. mylist = ['A','B',' ','D'] mylist = pd.DataFrame(mylist, columns = ['checklist']) mylist.to_csv('checklist.txt', index = False) mylist2 = pd.read_csv('checklist.txt') After the last line, I'm supposed to convert the series back to a list, but my ' ' row is missing from the data frame. Please help. |
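A sketch of read_csv options that may help here (this assumes the whitespace entry is being treated as a blank or NA line by the defaults): skip_blank_lines=False keeps lines that look empty, and keep_default_na=False stops pandas from converting empty fields to NaN.

import pandas as pd

mylist = ['A', 'B', ' ', 'D']
pd.DataFrame(mylist, columns=['checklist']).to_csv('checklist.txt', index=False)

# Keep seemingly blank lines and don't convert empty fields to NaN,
# so the ' ' entry survives the round trip.
mylist2 = pd.read_csv('checklist.txt', skip_blank_lines=False, keep_default_na=False)
print(mylist2['checklist'].tolist())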
xamarin.forms: AppTrackingTransparency no longer found in PCL, only in iOS project [duplicate] Posted: 30 Jun 2021 08:18 AM PDT Yesterday, xamarin.ios received an update. Since then, I cannot use code referencing App Tracking Transparency anymore. This line in my PCL class: using AppTrackingTransparency; just isn't found anymore. I tried restarting, reinstalling, deleting the bin and obj folders, rebuilding, and cleaning, all to no avail. I realised, however, that this is still found in the iOS AppDelegate class. I can simply go: ATTrackingManager.RequestTrackingAuthorization((status) => { if (status == ATTrackingManagerAuthorizationStatus.Authorized) { } else if (status == ATTrackingManagerAuthorizationStatus.Denied) { Device.BeginInvokeOnMainThread(async () => { }); } }); from the AppDelegate, and the window with the pop-up appears. However, we had this popup integrated into the PCL workflow. Since this is no longer the case, the pop-up will now just show on app launch and then never again. 1.) Why is this library suddenly missing from the PCL? 2.) Can I call a function in the PCL from the AppDelegate? I don't see any other solution to this. |
Error in for-loop for simulating data with negative difference scores as rows Posted: 30 Jun 2021 08:18 AM PDT I want to run a for loop to simulate some data, but run into an error. The general idea is that I want to simulate the formula: (D-A)/K. D and A can vary between values of 0 and 10. K can vary between values of 0 and 150. I want an output matrix/dataframe, with the difference for every possible combination of D-A on the rows (21 rows, varying between a value of -10 and +10), for each level of K (150 columns). I wrote this piece of code: D = c(0:10) A = c(0:10) K = c(0:150) output=matrix(NA) for (d in 0:10){ for (a in 0:10){ for (k in 0:150){ output[i-j,k]=(D[d]-A[a])/K[k] } } } But it doesn't work, I think because subsetting [i-j] gives a negative row number in some instances (for example D[2] - A[5]). I don't know how to work around this issue. The values of A, D, and K are fixed. |
How to create a Binary Tree from an array? Posted: 30 Jun 2021 08:18 AM PDT I would like to create a Binary Tree from an array. The input is an array: const arr1 = [3, 5, 1, 6, 2, 9, 8, null, null, 7, 4]; const arr2 = [3, 5, 1, 6, 7, 4, 2, null, null, null, null, null, null, 9, 8]; The image of the tree is like below: The output is a Tree Structure as below: const expectedOutput = { val: 3, left: { val: 5, left: { val: 6, left: null, right: null, }, right: { val: 2, left: { val: 7, left: null, right: null }, right: { val: 4, left: null, right: null }, }, }, right: { val: 1, left: { val: 9, left: null, right: null, }, right: { val: 8, left: null, right: null, }, }, }; So far I have tried: const arr1 = [3, 5, 1, 6, 2, 9, 8, null, null, 7, 4]; const arr2 = [3, 5, 1, 6, 7, 4, 2, null, null, null, null, null, null, 9, 8]; function TreeNode(val, left, right) { this.val = val === undefined ? 0 : val; this.left = left === undefined ? null : left; this.right = right === undefined ? null : right; } function createTree(arr, rootIndex = 0, childIndex = 0) { if (arr[rootIndex] === null) { return null; } else if (childIndex + 1 < arr.length) { childIndex++; const leftChildIndex = childIndex; childIndex++; const rightChildIndex = childIndex; return new TreeNode( arr[rootIndex], createTree(arr, leftChildIndex, childIndex), createTree(arr, rightChildIndex, childIndex) ); } else { return new TreeNode(arr[rootIndex], null, null); } } const output = createTree(arr1); console.log(output); Thanks. |
An SQL booking query to search X days either side of a date range Posted: 30 Jun 2021 08:18 AM PDT I've got an SQL query that looks up available holiday lets based on a start/end date query. I now need to query both the original range and also those consisting of the same number of days but shifted +/- 1 or up to 7 days either side of the original range to see if a booking could be accommodated then. The idea is to achieve something similar to Airbnb's buttons to check availability either side of your date selection. So if the original range was 2021-06-23 to 2021-06-30, I'd also like to return available results for, say, one day either side of that range; 2021-06-22 to 2021-06-29 or 2021-06-24 to 2021-07-01. The table layout is: Property table: CREATE TABLE `tbl_properties` ( `id` bigint(20) UNSIGNED NOT NULL, `post_id` bigint(20) UNSIGNED NOT NULL, `name` char(32) NOT NULL, `num_persons` int(11) NOT NULL, `num_children` int(11) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `properties` (`id`, `post_id`, `name`, `num_persons`, `num_children`) VALUES (1, 33741, 'House 1', 2, 0), (2, 31404, 'House 2', 2, 2); The bookings table: CREATE TABLE `tbl_bookings` ( `id` bigint(20) UNSIGNED NOT NULL, `ical_id` bigint(20) UNSIGNED NOT NULL, `start_date` datetime NOT NULL, `end_date` datetime NOT NULL, `archived` tinyint(1) NOT NULL DEFAULT 0 ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `tbl_bookings` (`id`, `ical_id`, `start_date`, `end_date`, `archived`) VALUES (1, 1, '2021-06-23 13:00:00', '2020-06-30 12:00:00', 0), (1, 1, '2021-06-30 13:00:00', '2020-07-07 12:00:00', 0), (2, 2, '2021-08-07 13:00:00', '2020-08-17 12:00:00', 0); My existing query looks up available properties looks like this: SELECT DISTINCT(p.post_id) FROM tbl_properties p LEFT JOIN tbl_bookings b ON h.id = b.ical_id AND b.archived = '0' AND ( '2021-06-23 13:00:00' BETWEEN b.start_date AND b.end_date OR '2021-06-30 13:00:00' BETWEEN b.start_date AND b.end_date OR b.start_date BETWEEN '2021-06-23 13:00:00' AND '2021-06-30 12:00:00' OR b.end_date BETWEEN '2021-06-23 13:00:00' AND '2021-06-30 12:00:00' ) WHERE b.id IS NULL AND( h.num_persons >= '0' AND ( h.num_children >= '0' OR ( tr.term_taxonomy_id = '51' AND ( h.num_persons + h.num_children ) >= '0' ) ) ) This returns available properties with no entries on the right bookings table for the date range via b.id IS NULL. I've tried adding more ranges to the clauses, one for each range shifted X days, attached to the join but only succeeded in reducing the number of properties returned instead of increasing the number returned as expected. So, ideally if I query availability 2021-06-23 to 2021-06-30 both example properties would be returned if I were to request +/- 7 days on that range. If anybody could point me in the right direction I'd be extremely grateful. I hope I've managed to describe the problem without causing too much confusion :) |
How to become a Game Programmer? Posted: 30 Jun 2021 08:18 AM PDT I'm a 2nd-year student doing my bachelor's degree in Computer Applications. I'm really interested in becoming a Game Programmer and my dream is to work at Rockstar Games😅(it sounds a bit crazy). I'm really worried and I don't know how to program or write code. I'm an absolute beginner. Do you have any tips or advice on how to get there and fulfill my dream? Thanks😀. |
Automating DynamoDB scripts Posted: 30 Jun 2021 08:18 AM PDT I want to do with my DynamoDB table what we used to do with RDBMS SQL scripts. Currently it's very difficult to track changes from environment to environment (dev - qa - prod). We are directly making changes via the console. What I want to do is keep the table data/JSON in git version control, and whenever any dev makes a change, we should be able to just run a script that will migrate the respective changes to the DynamoDB table, e.g. update/create/delete the tables, add/remove/update the records. But I am not able to find a proper way/guide to achieve this currently. I am using JavaScript/Node.js as our base language. Any help regarding this scenario will be appreciated. Thanks. ref: https://forums.aws.amazon.com/thread.jspa?threadID=342538 |
Angular Unit Test for passing form control Posted: 30 Jun 2021 08:18 AM PDT I am still new to Angular and Jasmine for unit testing. If the code in component.ts is like this: regexObj = { numberRegex: '^\d+$', nameRegex: '^[a-zA-Z0-9-_. ]*$', vendorRegex: '^[a-zA-Z0-9-_. ]*$', longLatRegex: '^[0-9.-]*$', priceRegex: '^[0-9]*$' }; setBillboardIdValidator(index: number) { this.getForms().controls[index].get('billboardId').setValidators([Validators.pattern(this.regexObj.nameRegex), Validators.required]); } getForms(): FormArray { return this.formGroup.get('forms') as FormArray; } how do I write tests for setBillboardIdValidator() and getForms() in component.spec.ts? In component.spec.ts, I created something like this: beforeEach(() => { fixture = TestBed.createComponent(UploadFormComponent); component = fixture.componentInstance; component.form = new FormGroup({ billboardName: new FormControl('', Validators.required) }); }); it('should return x control', () => { const control = component.form.controls['setBillboardIdValidator']; expect(control).toBeTruthy(); }); |
How to get comments and their sub-comments from NodeJS sequel? Posted: 30 Jun 2021 08:18 AM PDT I want to get the sub-comments of a main comment based on their parent id in NodeJS sequel. I want to get the comments and sub-comments as [ { "id" : 1, "name" : "harry", "comment" : "good luck" "sub comments": [ { "id" : 4, "name": "khan", "comment": "thanks" } ] "id" : 2, "name": tom, "comment": nice, "sub comments" [ { 'no sub comment of this comment' } ] } ] |
How to convert a list to code/lambda in scheme? Posted: 30 Jun 2021 08:19 AM PDT Let's say I have a list, for example: (define a '(+ (* p p) (* x x))) . How can I define a procedure with an expression given by a , for example here: (define (H x p) (+ (* p p) (* x x)))) ? I tried to do something like: (define (H x p) (eval a)) but it says that p and x are undefined. I guess, there's a simple workaround with apply or something similar, but I can't wrap my head around it. I guess I could just modify p and x in my list according to the value passed to the procedure H and then evaluate the new list, but it's kind of ugly... Or maybe there's a pretty way to do this? |
Database design issues due to concurrency in multi-node architecture Posted: 30 Jun 2021 08:18 AM PDT I will try my best to walk through the issue. Please let me know if anything needs clarification. Environment: The application is deployed on AWS, with multiple instances connected to a single datastore. The data store consists of the following tables. Legacy tables: instance_info (id, instance_details, ...) task_info (id, task_id, ...) Newly added table: new_table (id, instance_info_id, task_info_id, ...) Schema design: - id: in all the tables, id is the PK.
- In new_table the column,
- task_info_id is a foreign key to table task_info and,
- instance_info_id is a foreign key to table instance_info.
- A unique constraint exists on columns instance_info_id & task_info_id.
Problem: When the code executes, it divides (forks) its operation into multiple threads that execute independently and in parallel. On completion, these threads join and try to insert data into one of the legacy tables, "task_info". Now, there could be a situation where these multiple threads (running concurrently on a single node) will successfully populate multiple entries into the table. Requirement: If there are multiple threads working in parallel, only one thread should INSERT a record into the table "task_info", while the other threads should only update it. Limitations: - We cannot add unique constraints to the task_info table, as this approach ruins existing (legacy code) functionality for the retrying mechanism.
- We cannot lock the whole table during write operations, as this could end up creating performance issues for us.
- We considered a "write-through" mechanism (distributed Memcache); however, if we take downtime into consideration, that could lead to data loss.
Is there any efficient design approach (with minimal/no changes in the legacy code/design) that can be looked into? |
Scroll Context times out [Elasticsearch 5.6.7] Posted: 30 Jun 2021 08:18 AM PDT I am building a worker function to be used to scroll through the data in an Elasticsearch cluster. I am using the _search and scroll API and concurrent.futures in python with the slice query for parallelism. I have managed to pull data successfully for a whole month but something seems to be timing out my scroll context prematurely for the last 2 days and can't figure out what. The indexes are grouped in gameVersion.eventNameAsIndex but in the search query I just use gameVersion.* to bring all indexes for that version. We have about 4 versions and for all except one we have this issue for the last 2 days. This is the scroller function: def slicedScroller(Index, threadNr, maxThreads, lastRowDateOffset, endDate, maxProcessTimeSeconds, maxSizeMB, esClient, sid=''): startScroll = time.perf_counter() body = { "slice": { "id": threadNr, "max": maxThreads, "field": "baseCtx.date" }, "query": { "bool": { "must": [{ "exists": { "field": "baseCtx.date" } },{ "range": { "baseCtx.date": { "gt": lastRowDateOffset, "lt": endDate } } }] } } } data = [] dataBytes = b'' lastThreadDate = '' threadTotalHits = 0 while 1: if sid == '': page = esClient.search( index = Index, sort = "baseCtx.date:asc", scroll = scroll_time, size = scroll_size, body = body ) elif sid == "nPage": break elif sid[:5] == "lPage": page = esClient.scroll(body={"scroll": scroll_time, "scroll_id": sid[5:]}) else: page = esClient.scroll(body={"scroll": scroll_time, "scroll_id": sid}) data += page['hits']['hits'] sid = page["_scroll_id"] threadTotalHits = page['hits']['total'] if len(page['hits']['hits']) == 0 and sid[:5] == "lPage": sid = "nPage" break if len(page['hits']['hits']) < scroll_size: # If the length of the page is below scroll_size then it is the last page in the scroll sid = "lPage" + sid break if time.perf_counter() - startScroll > maxProcessTimeSeconds: break # If the amount of time it took to scroll over the slice is bigger than maxProcessingTimeSeconds if len(data) > rowsPerMB * maxSizeMB / maxThreads: break # about 12000 rows per compressed MB. 
If the number of rows is over the amount required to reach maxSizeMB when combining all slices # Else continue scrolling if len(data) != 0: dataBytes = gzip.compress(bytes(json.dumps(data)[1:-1], encoding='utf-8')) lastThreadDate = max([x['_source']['baseCtx']['date'] for x in data]) response = { "sid": sid, "threadNr": threadNr, "dataBytes": dataBytes, "lastThreadDate": lastThreadDate, "threadTotalHits": threadTotalHits, "threadPulledSize": len(data) } return response The slicedScroller() function is called below and results compiled in one file that is then uploaded to a blob storage: def batch(gameVersion, env='prod', startDate='auto', endDate='auto', writeDate=True): # #### Global Variables env = env.lower() lowerFormat = gameVersion.lower().replace(" ","_") azFormat = re.sub(r'[^0-9a-zA-Z]+', '-', gameVersion).lower() storageContainerName = azFormat curFileName = f"{lowerFormat}_cursor.py" curTempFilePath = os.path.join(tempFilePath,curFileName) curBlobFilePath = f"cursors/{curFileName}" sids = [] compressedTools = [gzip.compress(bytes('[', encoding='utf-8')), gzip.compress(bytes(',', encoding='utf-8')), gzip.compress(bytes(']', encoding='utf-8'))] # Parameter and state settings if os.getenv(f"{lowerFormat}_maxSizeMB") is not None: maxSizeMB = int(os.getenv(f"{lowerFormat}_maxSizeMB")) if os.getenv(f"{lowerFormat}_maxThreads") is not None: maxThreads = int(os.getenv(f"{lowerFormat}_maxThreads")) if os.getenv(f"{lowerFormat}_maxProcessTimeSeconds") is not None: maxProcessTimeSeconds = int(os.getenv(f"{lowerFormat}_maxProcessTimeSeconds")) esClient = es(maxThreads) Index = lowerFormat + ".*" if env == 'dev': Index = 'dev.' + Index try: cur = getAndLoadCursor(curBlobFilePath, curTempFilePath) except Exception as e: dtStr = f"{datetime.datetime.utcnow():%Y/%m/%d %H:%M:00}" #writeCursor(curBlobFilePath, f"# Please use format YYYY/MM/DD HH24:MI:SS\nlastPolled = '{dtStr}'") logging.info(f"No cursor file. 
Generated {curFileName} file with date {dtStr}") print(e) return # # Scrolling and Batching Engine if startDate == 'auto': lastRowDateOffset = cur.lastPolled else: lastRowDateOffset = startDate while 1: # Offset the current time by -5 minutes to account for the 2-3 min delay in Elasticsearch initTime = datetime.datetime.utcnow() if endDate == 'auto': endDate = f"{initTime-datetime.timedelta(minutes=minutesOffset):%Y/%m/%d %H:%M:%S}" dataBytes = [] dataSize = 0 start = time.perf_counter() with concurrent.futures.ThreadPoolExecutor() as executor: ids = list(range(maxThreads)) if len(sids) == 0: sids = ['' for x in range(maxThreads)] #print(f"Posted SIDs: {sids}") results = [ executor.submit( slicedScroller, Index, _id, maxThreads, lastRowDateOffset, endDate, maxProcessTimeSeconds, maxSizeMB, esClient, sid ) for _id, sid in zip(ids, sids) ] for f in concurrent.futures.as_completed(results): if f.result()["sid"][:5] != "nPage": lastRowDateOffset = max(lastRowDateOffset, f.result()["lastThreadDate"]) dataSize += f.result()["threadPulledSize"] if len(f.result()["dataBytes"]) > 0: dataBytes.append(f.result()["dataBytes"]) sids[f.result()["threadNr"]] = f.result()["sid"] if f.result()["sid"][:5] in ("nPage", "lPage"): sidTest = f.result()["sid"][:5] else: sidTest = 'newSid' print(f"Thread {f.result()['threadNr']} -- Results pulled {f.result()['threadPulledSize']} -- Cumulative Results: {dataSize} -- Process Time: {round(time.perf_counter()-start, 2)} sec -- SID: {sidTest}") #sidTest if dataSize == 0: break lastRowDateOffsetDT = datetime.datetime.strptime(lastRowDateOffset, '%Y/%m/%d %H:%M:%S') outFile = f"elasticsearch/live/{lastRowDateOffsetDT:%Y/%m/%d/%H}/{lowerFormat}_live_{lastRowDateOffsetDT:%Y%m%d%H%M%S}_{datetime.datetime.utcnow():%Y%m%d%H%M%S}.json.gz" print(f"Starting compression of {dataSize} rows -- {round(time.perf_counter()-start, 2)} sec") dataBytes = compressedTools[0] + compressedTools[1].join(dataBytes) + compressedTools[2] print(f"Comencing to upload data to blob -- {round(time.perf_counter()-start, 2)} sec") #uploadJsonGzipBlobBytes(outFile, dataBytes, storageContainerName, len(dataBytes)) logging.info(f"File compiled: {outFile} -- {dataSize} rows -- Process Time: {round(time.perf_counter()-start, 2)} sec") print(f"File compiled: {outFile} -- {dataSize} rows -- Process Time: {round(time.perf_counter()-start, 2)} sec\n") if len(set(sids)) == 1 and list(set(sids))[0] == "nPage": break if writeDate: 1==1#writeCursor(curBlobFilePath, f"# Please use format YYYY/MM/DD HH24:MI:SS\nlastPolled = '{lastRowDateOffset}'") logging.info(f"Closing Connection to {esUrl}") print(f"Closing Connection to {esUrl}") esClient.clear_scroll(scroll_id="_all") esClient.close() return The scroll_time="25m" and scroll_size=10000 . Finally when I call batch("Game Version with Problems") , sometimes it manages to scroll through and compile at least a file but in most cases, after one scroll I get an error. NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [52908399]') So I did a bit of digging and printed out the scroll_id's used while running and manually POST-ed them on Kibana's dev. 
It worked and returned the documents requested, but even though I stated scroll_time="25m", the same POST request returned the below after 1 minute, so I'm guessing that is a timeout: { "error": { "root_cause": [ { "type": "search_context_missing_exception", "reason": "No search context found for id [51779718]" }, { "type": "search_context_missing_exception", "reason": "No search context found for id [51779738]" },...], "type": "search_phase_execution_exception", "reason": "all shards failed", "phase": "query", "grouped": true, "failed_shards": [ { "shard": -1, "index": null, "reason": { "type": "search_context_missing_exception", "reason": "No search context found for id [51779718]" } }, { "shard": -1, "index": null, "reason": { "type": "search_context_missing_exception", "reason": "No search context found for id [51779738]" } },...], "caused_by": { "type": "search_context_missing_exception", "reason": "No search context found for id [51773825]" } }, "status": 404 } I've checked cluster health and all shards and nodes are green. I've changed the number of threads to 2, 10, then 30, the size to 10000 and 5000, and the scroll time to "1m", "5m", "10m" and "30m". I can't understand how to fix this, why it is happening only on some indexes, and why it started within the last 2 days. |
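For comparison, a minimal scroll skeleton without slices (a sketch assuming the elasticsearch-py 5.x client used above): the keep-alive passed in scroll only covers the window until the next scroll call for that scroll_id, so it has to be re-sent on every request, and the gap between consecutive calls for the same context must stay below it; otherwise a "No search context found" error like the one above is expected.

def drain_scroll(esClient, index, query, keepalive="25m", size=10000):
    # Initial search opens the scroll context.
    page = esClient.search(index=index, body=query, scroll=keepalive, size=size)
    sid = page["_scroll_id"]
    hits = page["hits"]["hits"]
    while hits:
        yield from hits
        # Each scroll call restarts the keep-alive timer for this context.
        page = esClient.scroll(scroll_id=sid, scroll=keepalive)
        sid = page["_scroll_id"]
        hits = page["hits"]["hits"]
    esClient.clear_scroll(scroll_id=sid)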
CancelIo doesn't represent actually cancelled I/Os Posted: 30 Jun 2021 08:18 AM PDT I wrote a little program. It issues a number of asynchronous reads to a file which are to be caught by a I/O completion port. Then I randomly cancel all I/Os to the file by CancelIo (which each overlapped-structure individually and not for the whole file-handle at once). Thereby I count the number of successful cancels and the cancels where I get a GetLastError() of ERROR_NOT_FOUND because the I/O-operation has been completed. Then I do GetQueuedCompletionStatus() for all issued I/Os. I count all successful I/Os and all I/Os which are cancelled (GetLastError() == ERROR_OPERATION_ABORTED). Interestingly the number of completed and cancelled I/Os isn't the same than I counted while issuing the CancelIo's. So why is there a difference? Here's my test-program: #include <Windows.h> #include <exception> #include <iostream> #include <cstring> #include <vector> #include <cstdint> #include <cstdlib> #include <system_error> #include <utility> #include <random> using namespace std; int main( int argc, char **argv ) { auto throwSysErr = []( char const *str ) { throw system_error( error_code( (int)GetLastError(), system_category() ), str ? str : "" ); }; try { if( argc < 3 ) throw exception( "parameters !" ); size_t blocks = atoi( argv[2] ); if( blocks == 0 ) throw exception( "blocks == 0" ); HANDLE hFile = CreateFileA( argv[1], GENERIC_READ | GENERIC_WRITE, 0, nullptr, CREATE_ALWAYS, FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING, NULL ); if( !hFile ) throwSysErr( "cf failed" ); size_t const BLOCKSIZE = 4096; void *pMem = VirtualAlloc( nullptr, BLOCKSIZE, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE ); if( !pMem ) throwSysErr( "va failed" ); OVERLAPPED ol; if( !(ol.hEvent = CreateEvent( nullptr, FALSE, FALSE, nullptr )) ) throwSysErr( "ce failed" ); uint64_t offset = (uint64_t)(blocks - 1) * BLOCKSIZE; ol.Offset = (DWORD) offset; ol.OffsetHigh = (DWORD)(offset >> 32); if( !WriteFile( hFile, pMem, BLOCKSIZE, nullptr, &ol ) && GetLastError() != ERROR_IO_PENDING ) throwSysErr( "wf failed " ); DWORD dwTransferred; if( !GetOverlappedResult( hFile, &ol, &dwTransferred, TRUE ) || dwTransferred != BLOCKSIZE ) throwSysErr( "gor failed " ); HANDLE hIocp = CreateIoCompletionPort( hFile, NULL, 0x12345678, 1 ); if( !hIocp ) throwSysErr( "ciocp failed" ); vector<OVERLAPPED> vol( blocks ); for( size_t b = 0; b != blocks; ++b ) { uint64_t offset = (uint64_t)b * BLOCKSIZE; vol[b].Offset = (DWORD) offset; vol[b].OffsetHigh = (DWORD)(offset >> 32); vol[b].hEvent = NULL; if( !ReadFile( hFile, pMem, BLOCKSIZE, nullptr, &vol[b] ) && GetLastError() != ERROR_IO_PENDING ) throwSysErr( "wf failed " ); } vector<size_t> blockReorderList( blocks ); for( size_t b = 0; b != blocks; ++b ) blockReorderList[b] = b; mt19937_64 mt; uniform_int_distribution<size_t> uidBlock( 0, blocks - 1 ); for( size_t b = 0; b != blocks; ++b ) swap( blockReorderList[b], blockReorderList[uidBlock( mt )] ); unsigned cancelSucceeded = 0, cancelCompleted = 0; for( size_t b = 0; b != blocks; ++b ) if( CancelIoEx( hFile, &vol[blockReorderList[b]] ) ) ++cancelSucceeded; else if( GetLastError() == ERROR_NOT_FOUND ) ++cancelCompleted; else throwSysErr( "cio failed " ); cout << "cancel succeeded: " << cancelSucceeded << endl; cout << "cancel already completed: " << cancelCompleted << endl; uint32_t completedCounter = 0, cancelledCounter = 0; for( size_t c = 0; c != blocks; ++c ) { DWORD dwBytesTransferred = 0; ULONG_PTR ulpCompletionKey; OVERLAPPED *pol; if( GetQueuedCompletionStatus( 
hIocp, &dwBytesTransferred, &ulpCompletionKey, &pol, INFINITE ) ) ++completedCounter; else if( GetLastError() == ERROR_OPERATION_ABORTED ) ++cancelledCounter; else throwSysErr( "gqcs failed" ); } cout << "completed: " << completedCounter << endl << "cancelled: " << cancelledCounter << endl; cout << endl; } catch( exception &exc ) { cout << exc.what() << endl; return EXIT_FAILURE; } return EXIT_SUCCESS; } |
How to set SQS predefined_queues for Celery with Apache Airflow configuration? Posted: 30 Jun 2021 08:18 AM PDT I am configuring Apache Airflow to use Celery with Amazon SQS. I understand that Celery allows for broker_transport_options (https://docs.celeryproject.org/en/stable/getting-started/brokers/sqs.html) and Airflow contains a section in its config called celery_broker_transport_options. I understand that I am able to pass simple strings in the Airflow config. For example, in the celery_broker_transport_options section, I could pass: region = us-west-1 which would be the equivalent of telling Celery: broker_transport_options = {'region': 'us-west-1'} I am trying to pass the predefined_queues option in Airflow, which looks like the following in Celery: broker_transport_options = { 'predefined_queues': { 'my-q': { 'url': 'https://ap-southeast-2.queue.amazonaws.com/123456/my-q', 'access_key_id': 'xxx', 'secret_access_key': 'xxx', } } } I am unsure how to pass this information to Airflow. I have tried the following, and I get an error saying that 'str' object has no attribute 'items': predefined_queues = 'my-q': { 'url': 'https://sqs.us-east-1.amazonaws.com/1234567890/my-q', } |
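One possible route (a sketch based on my assumptions, not verified against the Airflow version in use): nested values like predefined_queues generally can't be expressed as plain key = value strings in airflow.cfg, but the [celery] celery_config_options setting can point to a Python dict that extends Airflow's default Celery config, and that dict can hold the nested structure directly. The module name my_celery_config below is hypothetical; it just has to be importable on the scheduler and workers.

# my_celery_config.py (hypothetical module, referenced from airflow.cfg as:
#   [celery]
#   celery_config_options = my_celery_config.CELERY_CONFIG
# )
from airflow.config_templates.default_celery import DEFAULT_CELERY_CONFIG

# Copy the default transport options and add the nested SQS setting.
broker_transport_options = dict(DEFAULT_CELERY_CONFIG.get('broker_transport_options', {}))
broker_transport_options['predefined_queues'] = {
    'my-q': {
        'url': 'https://sqs.us-east-1.amazonaws.com/1234567890/my-q',
    }
}

CELERY_CONFIG = {
    **DEFAULT_CELERY_CONFIG,
    'broker_transport_options': broker_transport_options,
}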
Docker php—install php extensions ssh2 Posted: 30 Jun 2021 08:19 AM PDT I am using the official php Docker image as the base for my application container, so the Dockerfile starts like so: FROM php:5.6-fpm-jessie Later in the file I would like to have something like this: RUN apt-get update \ && apt-get install -y libssh2-1-dev libssh2-1 \ && docker-php-ext-install ssh2 But that tells me: /usr/src/php/ext/ssh2 does not exist So since that is a Debian (jessie) based image, only the old php5 packages are available, php7 is installed by some tricky script in the php:fpm Dockerfile, and it seems that all extensions are compiled into the php executable being used. How can I install more extensions in this scenario? |
How to validate a Spark SQL expression without executing it? Posted: 30 Jun 2021 08:18 AM PDT I want to validate whether a spark-sql query is syntactically correct or not without actually running the query on the cluster. The actual use case is that I am trying to develop a user interface that lets a user enter a spark-sql query, and I should be able to verify whether the query provided is syntactically correct or not. Also, if after parsing the query I could give any recommendations about the query with respect to Spark best practices, that would be even better. |
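A sketch of one way to do a pre-flight check from PySpark (one approach under my assumptions, not the only option): spark.sql() only parses and analyzes the statement and builds a DataFrame lazily, so as long as no action (collect, show, ...) is called, a pure syntax error surfaces as a ParseException and unresolved tables or columns surface as an AnalysisException, without anything running on the cluster.

from pyspark.sql import SparkSession
from pyspark.sql.utils import AnalysisException, ParseException

spark = SparkSession.builder.master("local[1]").getOrCreate()

def check_query(sql_text: str) -> str:
    try:
        spark.sql(sql_text)  # parse + analyze only; no action is triggered
        return "ok"
    except ParseException as e:
        return "syntax error: " + str(e).splitlines()[0]
    except AnalysisException as e:
        return "analysis error: " + str(e).splitlines()[0]

print(check_query("SELEC 1"))   # syntax error
print(check_query("SELECT 1"))  # ok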
Is it possible to open an Azure portal resource link not in the default directory? Posted: 30 Jun 2021 08:18 AM PDT This is for a user who has access to multiple directories (see screenshot below). For an Azure web app I can generate a link like the one below: https://ms.portal.azure.com/#resource/{resourceId}/DeploymentSource If the resource is in my default directory, I can paste the link into the browser and it will open the right blade. If I paste a link to a resource that is not in the default directory for the user, then I get the following: However, if I first go to the root of portal.azure.com and switch directory to the one containing the target web app, then paste the link to the blade, it works. Is it possible to tell the Azure portal to switch directory based on the resource in question? By the way, this is for code that we are writing that runs outside the Azure portal hosting frame (hence the desire to open a specific blade for a given web app). Thanks |
How to check if two boolean values are equal? Posted: 30 Jun 2021 08:19 AM PDT I need a method which I can call within the junit assertTrue() method which compares two booleans to check if they are equal, returning a boolean value. For example, something like this: boolean isEqual = Boolean.equals(bool1, bool2); which should return false if they are not equal, or true if they are. I've checked out the Boolean class but the only one that comes close is Boolean.compare() which returns an int value, which I can't use. |
Replace a substring of a file name Posted: 30 Jun 2021 08:18 AM PDT Sorry if this question has already been asked before. I didn't find an answer by searching. I need to replace a substring of a file name in Python. Old string: "_ready" New string: "_busy" Files: a_ready.txt, b_ready.txt, c.txt, d_blala.txt, e_ready.txt Output: a_busy.txt, b_busy.txt, c.txt, d_blala.txt, e_busy.txt Any ideas? I tried to use replace(), but nothing happened. The files still have the old names. Here is my code: import os counter = 0 for file in os.listdir("c:\\test"): if file.endswith(".txt"): if file.find("_ready") > 0: counter = counter + 1 print ("old name:" + file) file.replace("_ready", "_busy") print ("new name:" + file) if counter == 0: print("No file has been found") |
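For what it's worth, a small sketch of the likely fix (based only on the code shown, not tested against the poster's setup): str.replace() returns a new string and leaves both the variable and the file on disk untouched, so the new name has to be captured and passed to os.rename().

import os

folder = "c:\\test"  # same folder as in the question
counter = 0
for name in os.listdir(folder):
    if name.endswith(".txt") and "_ready" in name:
        counter += 1
        new_name = name.replace("_ready", "_busy")  # returns a new string
        os.rename(os.path.join(folder, name), os.path.join(folder, new_name))
        print("old name:" + name)
        print("new name:" + new_name)
if counter == 0:
    print("No file has been found")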
Java: how to convert a string to net.sf.json.JSONObject Posted: 30 Jun 2021 08:18 AM PDT I get tweets and use the org.json.simple API to convert a string to an object. JSONParser jsonParser = new JSONParser(); Object json = jsonParser.parse(in); I would like to insert the object into CouchDB using the couchdb4j API: Session myDbSession = new Session("localhost",5984) Database myCouchDb = myDbSession.getDatabase("db-name"); Document newdoc = new Document(); Document newdoc = new Document(JSONObject json); myCouchDb.saveDocument(newdoc); The error is: org.json.simple.JSONObject cannot be cast to net.sf.json.JSONObject How can I solve this problem, or can anyone suggest a way to insert a JSON-format string or object into CouchDB? |
How to compare two datepicker dates using jQuery Posted: 30 Jun 2021 08:19 AM PDT This is my code: var $from = $("#fromDate").datepicker('getDate'); var $to = $("#toDate").datepicker('getDate'); if($from > $to) alert("from date shouldn't greater than To date"); It works if the two dates are in the same year. Otherwise, for example with fromDate='1/12/2012' (dd/mm/yyyy) and toDate='18/6/2013' (dd/mm/yyyy), the condition check does not work: it throws the alert given above. |
Regex for all PRINTABLE characters Posted: 30 Jun 2021 08:18 AM PDT Is there a special regex statement like \w that denotes all printable characters? I'd like to validate that a string only contains a character that can be printed--i.e. does not contain ASCII control characters like \b (bell), or null, etc. Anything on the keyboard is fine, and so are UTF chars. If there isn't a special statement, how can I specify this in a regex? |
How to replace all occurrences of a string in JavaScript Posted: 30 Jun 2021 08:18 AM PDT I have this string in my JavaScript code: "Test abc test test abc test test test abc test test abc" Doing: str = str.replace('abc', ''); Seems to only remove the first occurrence of abc in the string above. How can I replace all occurrences of it? |